Preface


For undergraduate students, the transition from calculus to analysis is often 
disorienting and mysterious. What happened to the beautiful calculus formulas?
Where did ε-δ and open sets come from? It is not until later that one
integrates these seemingly distinct points of view. When teaching “advanced 
calculus,” I always had a difficult time answering these questions. 

Now, every mathematician knows that analysis arose naturally in the nine- 
teenth century out of the calculus of the previous two centuries. Believing that 
it was possible to write a book reflecting, explicitly, this organic growth, I set 
out to do so. 

I chose several of the jewels of classical eighteenth- and nineteenth-century 
analysis and inserted them near the end of the book, inserted the axioms for 
reals at the beginning, and filled in the middle with (and only with) the 
material necessary for clarity and logical completeness. In the process, every 
little piece of one-variable calculus assumed its proper place, and theory and 
application were interwoven throughout. 

Let me describe some of the unusual features in this text, as there are other 
books that adopt the above point of view. First is the systematic avoidance of 
ε-δ arguments. Continuous limits are defined in terms of limits of sequences,
limits of sequences are defined in terms of upper and lower limits, and upper
and lower limits are defined in terms of sup and inf. Everybody thinks in terms
of sequences, so why do we teach our undergraduates ε-δ’s? (In calculus texts,
especially, doing this is unconscionable.) 

The second feature is the treatment of integration. We follow the standard 
treatment motivated by geometric measure theory, with a few twists thrown 
in: The area is two-dimensional Lebesgue measure, defined on all subsets of 
R², the integral of an arbitrary¹ nonnegative function is the area under its
graph, and the integral of an arbitrary integrable function is the difference 
of the integrals of its positive and negative parts. 


¹ Not necessarily measurable.

In dealing with arbitrary subsets of R² and arbitrary functions, only a few
basic properties can be derived; nevertheless, surprising results are available; 
for example, the integral of an arbitrary integrable function over an interval 
is a continuous function of the endpoints. 

Arbitrary functions are considered to the extent possible not because of 
generality for generality’s sake, but because they fit naturally within the con- 
text laid out here. For example, the density theorem and maximal inequality 
are valid for arbitrary sets and functions, and the first fundamental theorem 
is valid for an integrable f if and only if f is measurable. 

In Chapter 4 we restrict attention to the class of continuous functions, 
which is broad enough to handle the applications in Chapter 5, and derive 
the second fundamental theorem in the form 


∫_a^b f(x) dx = F(b−) − F(a+).


Here a, b, F(a+) or F(b−) may be infinite, extending the immediate applicability,
and the continuous function f need only be nonnegative or integrable.

The third feature is the treatment of the theorems involving interchange of 
limits and integrals. Ultimately, all these theorems depend on the monotone 
convergence theorem which, from our point of view, follows from the Greek 
mathematicians’ Method of Exhaustion. Moreover, these limit theorems are 
stated only after a clear and nontrivial need has been elaborated. For exam- 
ple, differentiation under the integral sign is used to compute the Gaussian 
integral. 

The treatment of integration presented here emphasizes geometric aspects 
rather than technicality. The most technical aspects, the derivation of the 
Method of Exhaustion in §4.5, may be skipped upon first reading, or skipped 
altogether, without affecting the flow. 

The fourth feature is the use of real-variable techniques in Chapter 5. We 
do this to bring out the elementary nature of that material, which is usually 
presented in a complex setting using transcendental techniques. For example, 
included is: 


• A real-variable derivation of Gauss’ AGM formula motivated by the unit
circle map
x′ + iy′ = (√a x + i√b y)/(√a x − i√b y).
• A real-variable computation of the radius of convergence of the Bernoulli
series, derived via the infinite product expansion of sinh x/x, which is in
turn derived by combinatorial real-variable methods.
• The zeta functional equation is derived via the theta functional equation,
which is in turn derived via the connection to the parametrization of the
AGM curve.


The fifth feature is our emphasis on computational problems. Computa- 
tion, here, is often at a deeper level than expected in calculus courses and 
varies from the high school quadratic formula in §1.4 to exp(−ζ′(0)) = √(2π)
in §5.8. 

Because we take the real numbers as our starting point, basic facts about 
the natural numbers, trigonometry, or integration are rederived in this con- 
text, either in the body of the text or as exercises. For example, while the 
trigonometric functions are initially defined via their Taylor series, later it is 
shown how they may be defined via the unit circle. 

Although it is helpful for the reader to have seen calculus prior to reading 
this text, the development does not presume this. We feel it is important for 
undergraduates to see, at least once in their four years, a nonpedantic, purely 
logical development that really does start from scratch (rather than pretends 
to), is self-contained, and leads to nontrivial and striking results. 

Applications include a specific transcendental number; convexity, elemen- 
tary symmetric polynomial inequalities, subdifferentials, and the Legendre 
transform; Machin’s formula; the Cantor set; the Bailey–Borwein–Plouffe series

π = Σ_{n=0}^∞ (1/16^n) (4/(8n+1) − 2/(8n+4) − 1/(8n+5) − 1/(8n+6));

continued fractions; Laplace and Fourier transforms; Bessel functions; Euler’s
constant; the AGM iteration; the gamma and beta functions; Stirling identity


exp(∫_s^{s+1} log Γ(x) dx) = s^s e^{−s} √(2π), s > 0;


the entropy of the binomial coefficients; infinite products and Bernoulli num- 
bers; theta functions and the AGM curve; the zeta function; the zeta series 


log(x!) = −γx + ζ(2) x²/2 − ζ(3) x³/3 + ζ(4) x⁴/4 − ..., 1 ≥ x > −1;


primes in arithmetic progressions; the Euler—Maclaurin formula; and the Stir- 
ling series. 

After the applications, Chapter 6 develops the fundamental theorems in 
their general setting, based on the sunrise lemma. This material is at a more 
advanced level, but is included to point the reader toward twentieth-century 
developments. 




As an aid to self-study and assimilation, there are 450 problems with all 
solutions at the back of the book. Every exercise can be solved using only 
previous material from this book. Chapters 1—4 provide the basis for a calcu- 
lus or beginner undergraduate analysis course. Chapters 4 and 5 provide the 
basis for an undergraduate computational analysis course, while Chapters 4 
and 6 provide the basis for an undergraduate real analysis course. 


Philadelphia, PA, USA Omar Hijab 
Fall 2015 
A Note to the Reader


This text consists of many assertions, some big, some small, some almost 
insignificant. These assertions are obtained from the properties of the real 
numbers by logical reasoning. Assertions that are especially important are 
called theorems. An assertion’s importance is gauged by many factors, in- 
cluding its depth, how many other assertions it depends on, its breadth, how 
many other assertions are explained by it, and its level of symmetry. The 
later portions of the text depend on every single assertion, no matter how 
small, made in Chapter 1. 

The text is self-contained, and the exercises are arranged in order: Every 
exercise can be done using only previous material from this text. No outside 
material is necessary. 

Doing the exercises is essential for understanding the material in the text. 
Sections are numbered sequentially within each chapter; for example, §4.3 
means the third section in Chapter 4. Equation numbers are written within 
parentheses and exercise numbers in bold. Theorems, equations, and exercises 
are numbered sequentially within each section; for example, Theorem 4.3.2 
denotes the second theorem in §4.3, (4.3.1) denotes the first numbered equa- 
tion in §4.3, and 4.3.3 denotes the third exercise at the end of §4.3. 
Throughout, we use the abbreviation “iff” to mean “if and only if” and ⊓⊔
to signal the end of a derivation.
Chapter 1
The Set of Real Numbers 


1.1 Sets and Mappings 


We assume the reader is familiar with the usual notions of sets and mappings, 
but we review them to fix the notation. The reader may wish to skip this 
section altogether and refer to it as needed later in the text. 

In set theory, everything is a set and set membership is the sole primitive 
notion: Given sets a and A, either a belongs to¹ A or not. We write a ∈ A
in the first case and a ∉ A in the second. If a belongs to A, we say a is an
element of A. 

Let A, B be sets. If every element of A is an element of B, we say A is
a subset of B, and we write A ⊂ B. Equivalently, we say B is a superset of
A and we write B ⊃ A. When we write A ⊂ B or A ⊃ B, we allow for the
possibility A = B, i.e., A ⊂ A and A ⊃ A.

Elements characterize sets: If A and B have the same elements, then
A = B. More precisely, if A ⊂ B and B ⊂ A, then A = B.

Given a and b, there is exactly one set, their pair, whose elements are a
and b; this set is denoted {a, b}. In particular, given a, there is the singleton
set {a, a}, denoted {a}. The ordered pair of sets a and b is the set


(a, b) = {{a}, {a, b}}.


Then (a, b) = (c, d) iff a = c and b = d (Exercise 1.1.9).
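
The definition of the ordered pair can be tried out on a computer. The following is a minimal Python sketch, not part of the formal development. It models sets by Python's frozenset and assumes a and b are hashable values such as integers; the names pair, first, and second are ours, introduced only for illustration.

```python
def pair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}}, modeled with frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def first(p):
    """Recover a: the single element common to every member of the pair."""
    (a,) = frozenset.intersection(*p)
    return a


def second(p):
    """Recover b: the element of the union other than a, or a itself if a = b."""
    a = first(p)
    rest = frozenset.union(*p) - {a}
    return next(iter(rest)) if rest else a


if __name__ == "__main__":
    # Unlike the unordered pair {a, b}, the ordered pair distinguishes order.
    print(pair(1, 2) == pair(2, 1))
    print(first(pair(1, 2)), second(pair(1, 2)))
```

Note that pair(a, a) collapses to {{a}}, which is why second needs the special case a = b.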

There is a set ∅ having no elements, the empty set. Note ∅ is a subset of
every set. 

The union of sets A and B is the set C whose elements lie in A or lie in
B; we write C = A ∪ B, and we say C equals A union B. The intersection
of sets A and B is the set C whose elements lie in A and lie in B; we write
C = A ∩ B and we say C equals A inter B. Similarly we write A ∪ B ∪ C,
A ∩ B ∩ C for sets A, B, C, etc.


¹ Alternatively, a lies in A or a is in A.

More generally, let F be a set. The union ⋃F,

⋃F = ⋃{A : A ∈ F} = {x : x ∈ A for some A ∈ F},

is the set whose elements are elements of elements of F. The intersection
⋂F,

⋂F = ⋂{A : A ∈ F} = {x : x ∈ A for all A ∈ F},

is the set whose elements lie in all the elements of F. The cases in the previous
paragraph correspond to F = {A, B} and F = {A, B, C}.

The set of all elements in A, but not in B, is denoted A \ B = {x ∈ A :
x ∉ B} and is called the complement of B in A. For example, when A ⊂ B,
the set A \ B is empty. Often the set A is understood from the context; in
these cases, A \ B is denoted Bᶜ and called the complement of B.

Let A and B be sets. If they have no elements in common, A ∩ B = ∅,
we say they are disjoint. Note ⋂F ⊂ ⋃F for any nonempty set F and ∅ is
disjoint from every set.

We will have occasion to use De Morgan’s law, 


(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ, (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ,


or, more generally,


(⋃{A : A ∈ F})ᶜ = ⋂{Aᶜ : A ∈ F},
(⋂{A : A ∈ F})ᶜ = ⋃{Aᶜ : A ∈ F}.
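
A finite check is no substitute for the verification asked for in Exercise 1.1.3, but it may help fix the notation. The sketch below is our own illustration: it assumes a finite understood set, here called universe, and a nonempty finite family of its subsets, and the function names are hypothetical.

```python
def complement(universe, subset):
    """Complement of subset relative to the understood set universe."""
    return set(universe) - set(subset)


def de_morgan_holds(universe, family):
    """Check both general De Morgan laws for a nonempty family of subsets."""
    family = [set(A) for A in family]
    union = set().union(*family)
    intersection = set.intersection(*family)
    complements = [complement(universe, A) for A in family]
    # Complement of the union equals the intersection of the complements.
    first_law = complement(universe, union) == set.intersection(*complements)
    # Complement of the intersection equals the union of the complements.
    second_law = complement(universe, intersection) == set().union(*complements)
    return first_law and second_law


if __name__ == "__main__":
    print(de_morgan_holds(range(10), [{1, 2, 3}, {2, 3, 4}, {3, 5}]))
```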


The power set of a set X is the set 2^X whose elements are the subsets of
X. If X, Y are sets, their (ordered) product is the set X × Y whose elements
consist of all ordered pairs (x, y) with x ∈ X and y ∈ Y. Everything in this
text is either an element or a subset of repeated powers or products of the set 
R of real numbers. 

A relation between two sets X and Y is a subset f ⊂ X × Y. A mapping or
function is a relation f ⊂ X × Y, such that, for each x ∈ X, there is exactly
one y ∈ Y with (x, y) ∈ f. In this case, it is customary to write y = f(x) and
f : X → Y.

If f : X → Y is a mapping, X is the domain, Y is the codomain, and for
A ⊂ X, f(A) = {f(x) : x ∈ A} ⊂ Y is the image of A. In particular, the
range is f(X).

A mapping f : X → Y is injective if f(x) = f(x′) implies x = x′, whereas
f : X → Y is surjective if every element y of Y equals f(x) for some x ∈ X,
i.e., if the range equals the codomain. A mapping that is both injective and 
surjective is bijective. Alternatively, we say f is an injection, a surjection, 
and a bijection, respectively. 
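
For finite sets these notions can be tested directly. In the sketch below, which is only an illustration under our own conventions, a mapping f : X → Y is modeled as a Python dict whose keys form the domain X; the codomain Y must be supplied separately, since it cannot be read off from the dict. The function names are hypothetical.

```python
def image(f, A):
    """The image f(A) = {f(x) : x in A} of a subset A of the domain."""
    return {f[x] for x in A}


def is_injective(f):
    """f(x) = f(x') implies x = x': no value is taken twice."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """The range f(X) equals the codomain."""
    return image(f, f.keys()) == set(codomain)


def is_bijective(f, codomain):
    """Both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)


if __name__ == "__main__":
    f = {1: "a", 2: "b", 3: "a"}
    print(image(f, {1, 3}), is_injective(f), is_surjective(f, {"a", "b"}))
```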
If f : X → Y and g : Y → Z are mappings, their composition is the
mapping g ∘ f : X → Z given by (g ∘ f)(x) = g(f(x)) for all x ∈ X. In
general, g ∘ f ≠ f ∘ g.

If f : X → Y and g : Y → X are mappings, we say they are inverses
of each other if g(f(x)) = x for all x ∈ X and f(g(y)) = y for all y ∈ Y.
A mapping f : X → Y is invertible if it has an inverse g. It is a fact that a
mapping f is invertible iff f is bijective.
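
Composition and inverses can be illustrated on finite sets in the same spirit. This is a sketch under our own assumptions: mappings are modeled as Python dicts, the codomain is passed explicitly, and compose and inverse are hypothetical helper names. It illustrates, but does not prove, the fact just stated (Exercise 1.1.1).

```python
def compose(g, f):
    """The composition g o f, given by (g o f)(x) = g(f(x))."""
    return {x: g[f[x]] for x in f}


def inverse(f, codomain):
    """Return the inverse of f : X -> codomain, or None if f is not bijective."""
    g = {y: x for x, y in f.items()}
    # g loses entries when f is not injective; its keys miss part of the
    # codomain when f is not surjective.
    if len(g) != len(f) or set(g) != set(codomain):
        return None
    return g


if __name__ == "__main__":
    f = {1: "a", 2: "b", 3: "c"}
    g = inverse(f, {"a", "b", "c"})
    print(compose(g, f))  # the identity mapping on the domain of f
    print(compose(f, g))  # the identity mapping on the codomain of f
```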

Let F be a set such that all elements A ∈ F are nonempty. A choice
function is a function f : F → ⋃F satisfying f(A) ∈ A for A ∈ F. Given
such F, the (unordered) product ∏F is the set of all choice functions.

When X and Y are disjoint, the map f ↦ (f(X), f(Y)) is a bijection
between ∏{X, Y} and X × Y.
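
For a finite family of finite sets, the choice functions can be listed explicitly. The sketch below is our own illustration: it assumes the members of the family are distinct nonempty frozensets, models a choice function as a dict sending each member A to the chosen element of A, and uses itertools.product from the Python standard library to enumerate the choices. The name choice_functions is hypothetical.

```python
from itertools import product


def choice_functions(family):
    """All choice functions on a finite family of distinct nonempty frozensets."""
    family = list(family)
    # Each tuple of picks selects one element from each member of the family.
    return [dict(zip(family, picks)) for picks in product(*family)]


if __name__ == "__main__":
    X = frozenset({1, 2})
    Y = frozenset({3, 4, 5})
    fs = choice_functions([X, Y])
    # The map f -> (f(X), f(Y)) pairs choice functions with elements of X x Y.
    print(sorted((f[X], f[Y]) for f in fs))
```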

When F is a finite set (§1.3), F = {A_1, ..., A_n}, with A_1, ..., A_n
nonempty, a choice function corresponds to a choice of an element a_k ∈ A_k
for each 1 ≤ k ≤ n. In this case, the nonemptiness of ∏F is established in
Exercise 1.3.24. This is the axiom of finite choice.

We assume that ∏F is nonempty whenever F is countable (§1.7), i.e.,
given nonempty sets A_1, A_2, ..., a choice a_k ∈ A_k for k ≥ 1 exists. This is
the axiom of countable choice.


Exercises 


1.1.1. Show that a mapping f : X → Y is invertible iff it is bijective.


1.1.2. Let f : X → Y be bijective. Show that the inverse g : Y → X is
unique.


1.1.3. Verify De Morgan’s law.

1.1.4. Show that ⋃{x} = x for all x and {⋃x} = x iff x is a singleton set.
1.1.5. Given sets a, b, let c = a ∪ b, d = a ∩ b. Show that (c \ a) ∪ d = b.
1.1.6. If A ∈ F, then A ⊂ ⋃F and A ⊃ ⋂F.


1.1.7. Let (a, b) be an ordered pair of sets a, b. Show that ⋃⋂(a, b) = a,
⋃⋃(a, b) = a ∪ b, ⋂⋃(a, b) = a ∩ b, and ⋂⋂(a, b) = a. Conclude that b may
be computed from (a, b).


1.1.8. A set x is hierarchical if a ∈ x implies a ⊂ x. For example, ∅ is
hierarchical; let S(x) = x ∪ {x}. Show that S(x) is hierarchical whenever
x is.


1.1.9. Given sets a, b, c, d, show that (a, b) = (c, d) iff a = c and b = d.
1.2 The Set R


We are ultimately concerned with one and only one set, the set R of real 
numbers. The properties of R that we use are 


• The arithmetic properties,
• The ordering properties, and
• The completeness property.


Throughout, we use “real” to mean “real number,” i.e., an element of R. 

The arithmetic properties start with the fact that reals a, b can be added to
produce a real a + b, the sum of a and b. The rules for addition are a + b = b + a
and a + (b + c) = (a + b) + c, valid for all reals a, b, and c. There is also a
real 0, called zero, satisfying a + 0 = 0 + a = a for all reals a, and each real
a has a negative −a satisfying a + (−a) = 0. As usual, we write subtraction
a + (−b) as a − b.

Reals a, b can also be multiplied to produce a real a · b, the product of a and
b, also written ab. The rules for multiplication are ab = ba, a(bc) = (ab)c,
valid for all reals a, b, and c. There is also a real 1, called one, satisfying
a1 = 1a = a for all reals a, and each real a ≠ 0 has a reciprocal 1/a satisfying
a(1/a) = 1. As usual, we write division a(1/b) as a/b.

Addition and multiplication are related by the property a(b + c) = ab + ac
for all reals a, b, and c and the assumption 0 ≠ 1. These are the arithmetic
properties of the reals.

Let us show how the arithmetic properties imply there is a unique real
number 0 satisfying 0 + a = a + 0 = a for all a. If 0′ were another real satisfying
0′ + a = a + 0′ = a for all a, then we would have 0′ = 0 + 0′ = 0′ + 0 = 0,
hence 0 = 0′. Also it follows that there is a unique real playing the role of
one and 0a = 0 for all a.

The ordering properties start with the fact that there is a subset R⁺ of R,
the set of positive numbers, that is closed under addition and multiplication,
i.e., if a, b ∈ R⁺, then a + b, ab ∈ R⁺. If a is positive, we write a > 0 or 0 < a,
and we say a is greater than 0 or 0 is less than a, respectively. Let R⁻ denote
the set of negative numbers, i.e., R⁻ = −R⁺ is the set whose elements are
the negatives of the elements of R⁺. The rules for ordering are that the sets
R⁻, {0}, R⁺ are pairwise disjoint and their union is all of R. These are the
ordering properties of R.

We write a > b and b < a to mean a − b > 0. Then 0 > a iff a is negative
and a > b implies a + c > b + c. In particular, for any pair of reals a, b, we
have a < b or a = b or a > b.

From the ordering properties, it follows, for example, that 1 > 0, i.e., one
is positive, a < b and c > 0 imply ac < bc, 0 < a < b implies aa < bb, and
a < b, b < c imply a < c. As usual, we also write ≤ to mean < or =, ≥ to
mean > or =, and we say a is nonnegative or nonpositive if a ≥ 0 or a ≤ 0.

If S is a set of reals, a number M is an upper bound for S if x ≤ M for all
x ∈ S. Similarly, m is a lower bound for S if m ≤ x for all x ∈ S (Figure 1.1).
For example, 1 and 1 + 1 are upper bounds for the sets J = {x : 0 < x < 1}
and I = {x : 0 ≤ x ≤ 1}, whereas 0 and −1 are lower bounds for these sets.
S is bounded above (below) if it has an upper (lower) bound. S is bounded if
it is bounded above and bounded below.

Not every set of reals has an upper or a lower bound. Indeed, it is easy 
to see that R itself is neither bounded above nor bounded below. A more 
interesting example is the set N of natural numbers (next section): N is not 
bounded above. 


Fig. 1.1 Upper and lower bounds for A (a number line with m, x, M marked from left to right)


A given set S of reals may have several upper bounds. If S has an upper
bound M such that M ≤ b for any other upper bound b of S, then we say
M is a least upper bound or M is a supremum or sup for S, and we write
M = sup S.

If a is a least upper bound for S and b is an upper bound for S, then a ≤ b
since b is an upper bound and a is a least such. Similarly, if b is a least upper
bound for S and a is an upper bound for S, then b ≤ a since a is an upper
bound and b is a least such. Hence, if both a and b are least upper bounds, we
must have a = b. Thus, the sup, whenever it exists, is uniquely determined.

For example, consider the sets I and J defined above. If M is an upper
bound for I, then M ≥ x for every x ∈ I, hence M ≥ 1. Thus, 1 is the least
upper bound for I, or 1 = sup I. The situation with the set J is only slightly
more subtle: If M < 1, then c = (1 + M)/2 satisfies M < c < 1, so c ∈ J,
hence M cannot be an upper bound for J. Thus, 1 is the least upper bound
for J, or 1 = sup J.
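
The midpoint argument for J can be carried out with exact rational arithmetic. The sketch below is an illustration only: it uses Fraction from the Python standard library, restricts to candidates 0 ≤ M < 1 (an assumption of ours, which guarantees c > 0), and the name witness is hypothetical.

```python
from fractions import Fraction


def witness(M):
    """For 0 <= M < 1, return c = (1 + M)/2, an element of J with M < c."""
    M = Fraction(M)
    if not (0 <= M < 1):
        raise ValueError("expected 0 <= M < 1")
    c = (1 + M) / 2
    # c is the midpoint of M and 1, so it lies strictly between them.
    assert M < c < 1
    return c


if __name__ == "__main__":
    # No number below 1 is an upper bound for J: each candidate is beaten.
    for M in [Fraction(1, 2), Fraction(9, 10), Fraction(999, 1000)]:
        print(M, "is exceeded by", witness(M))
```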

A real m that is a lower bound for S and satisfies m ≥ b for all other lower
bounds b is called a greatest lower bound or an infimum or inf for S, and we
write m = inf S. Again the inf, whenever it exists, is uniquely determined.
As before, it follows easily that 0 = inf I and 0 = inf J.

The completeness property of R asserts that every nonempty set S ⊂ R
that is bounded above has a sup, and every nonempty set S ⊂ R that is
bounded below has an inf.

We introduce a convenient abbreviation, two symbols ∞, −∞, called
infinity and minus infinity, subject to the ordering rule −∞ < x < ∞ for
all reals x. If a set S is not bounded above, we write sup S = ∞. If S is not
bounded below, we write inf S = −∞. For example, sup R = ∞, inf R = −∞;
in §1.4 we show that sup N = ∞. Recall that the empty set ∅ is a subset of R.
Another convenient abbreviation is to write sup ∅ = −∞, inf ∅ = ∞. Clearly,
when S is nonempty, inf S ≤ sup S.

With this terminology, the completeness property asserts that every subset
of R, bounded or unbounded, empty or nonempty, has a sup and has an inf;
these may be reals or ±∞.

We emphasize that ∞ and −∞ are not reals but just convenient abbreviations.
As mentioned above, the ordering properties of ±∞ are −∞ < x < ∞
for all real x; it is convenient to define the following arithmetic properties of
±∞:


∞ + ∞ = ∞,
−∞ − ∞ = −∞,
∞ − (−∞) = ∞,
∞ + c = ∞, c ∈ R,
−∞ + c = −∞, c ∈ R,
(±∞) · c = ±∞, c > 0,
∞ · ∞ = ∞, ∞ · (−∞) = −∞.


Note that we have not defined ∞ − ∞, 0 · ∞, ∞/∞, or c/0.
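
Floating-point arithmetic on a computer follows similar conventions, which gives a quick way to experiment with these rules. The Python sketch below is an analogy only: floating-point numbers are not the reals, and math.inf is merely a stand-in for ∞. In Python the expressions ∞ − ∞, 0 · ∞, and ∞/∞ evaluate to the special value nan ("not a number"), and division by zero raises ZeroDivisionError. The function names are ours.

```python
import math

INF = math.inf


def defined_rules_hold(c=3.0):
    """Check the rules listed above for one real c > 0, using floats."""
    return all([
        INF + INF == INF,
        -INF - INF == -INF,
        INF - (-INF) == INF,
        INF + c == INF and INF + (-c) == INF,
        -INF + c == -INF and -INF + (-c) == -INF,
        INF * c == INF and (-INF) * c == -INF,
        INF * INF == INF,
        INF * (-INF) == -INF,
    ])


def undefined_cases():
    """Report what floats do with the expressions left undefined above."""
    results = {
        "inf - inf": INF - INF,
        "0 * inf": 0.0 * INF,
        "inf / inf": INF / INF,
    }
    try:
        results["c / 0"] = 1.0 / 0.0
    except ZeroDivisionError:
        results["c / 0"] = None  # Python refuses to divide by zero
    return results


if __name__ == "__main__":
    print(defined_rules_hold())
    print(undefined_cases())
```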

Let a be an upper bound for a set S. If a ∈ S, we say a is a maximum of
S, and we write a = max S. For example, with I as above, max I = 1. The
max of a set S need not exist; for example, according to the theorem below,
max J does not exist.

Similarly, let a be a lower bound for a set S. If a ∈ S, we say a is a
minimum of S, and we write a = min S. For example, min I = 0 but min J
does not exist.


Theorem 1.2.1. Let S ⊂ R be a set. The max of S and the min of S are
uniquely determined whenever they exist. The max of S exists iff the sup of 
S lies in S, in which case the max equals the sup. The min of S exists iff the 
inf of S lies in S, in which case the min equals the inf. 


To see this, note that the first statement follows from the second since we 
already know that the sup and the inf are uniquely determined. To establish 
the second statement, suppose that sup S ∈ S. Then since sup S is an upper
bound for S, max S = sup S. Conversely, suppose that max S exists. Then
sup S ≤ max S since max S is an upper bound and sup S is the least such.
On the other hand, sup S is an upper bound for S and max S ∈ S. Thus,
max S ≤ sup S. Combining sup S ≤ max S and sup S ≥ max S, we obtain
max S = sup S. For the inf, the derivation is completely analogous.

Because of this, when max S exists, we say the sup is attained. Thus, the
sup for I is attained, whereas the sup for J is not. Similarly, when min S
exists, we say the inf is attained. Thus, the inf for I is attained, whereas the 
inf for J is not. 

Let A, B be subsets of R, let a be real, and let c > 0; let 


−A = {−x : x ∈ A}, A + a = {x + a : x ∈ A}, cA = {cx : x ∈ A},

and

A + B = {x + y : x ∈ A, y ∈ B}.
Here are some simple consequences of the definitions that must be checked 
at this stage: 


• A ⊂ B implies sup A ≤ sup B and inf A ≥ inf B (monotonicity property).

• sup(−A) = −inf A, inf(−A) = −sup A (reflection property).

• sup(A + a) = sup A + a, inf(A + a) = inf A + a for a ∈ R (translation
property).

• sup(cA) = c sup A, inf(cA) = c inf A for c > 0 (dilation property).

• sup(A + B) = sup A + sup B, inf(A + B) = inf A + inf B (addition property),
whenever the sum of the sups and the sum of the infs are defined.


These properties hold whether A and B are bounded or unbounded, empty 
or nonempty. 
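
For finite sets of integers the sup is simply the max and the inf the min, so these properties can be spot-checked by machine. The sketch below is an illustration under that restriction and says nothing about infinite sets, where completeness is needed. It adopts the conventions sup ∅ = −∞ and inf ∅ = ∞ through the default argument of Python's built-in max and min; the helper names are ours.

```python
import math


def sup(S):
    """sup of a finite set; the empty set gets -infinity by convention."""
    return max(S, default=-math.inf)


def inf(S):
    """inf of a finite set; the empty set gets +infinity by convention."""
    return min(S, default=math.inf)


def neg(A):
    return {-x for x in A}


def shift(A, a):
    return {x + a for x in A}


def scale(c, A):
    return {c * x for x in A}


def add(A, B):
    return {x + y for x in A for y in B}


if __name__ == "__main__":
    A, B = {1, 2, 5}, {-3, 4}
    print(sup(add(A, B)) == sup(A) + sup(B))   # addition property
    print(sup(neg(A)) == -inf(A))              # reflection property
    print(sup(add(set(), B)) == sup(set()) + sup(B))  # empty case
```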

We verify the first and the last properties, leaving the others as Exercise 1.2.7.
For the monotonicity property, if A is empty, the property is
immediate since sup A = −∞ and inf A = ∞. If A is nonempty and a ∈ A,
then a ∈ B, hence inf B ≤ a ≤ sup B. Thus, sup B and inf B are upper
and lower bounds for A, respectively. Since sup A and inf A are the least and
greatest such, we obtain inf B ≤ inf A ≤ sup A ≤ sup B.

Now we verify sup(A + B) = sup A + sup B. If A is empty, then so is A + B;
in this case, the assertion to be proved reduces to −∞ + sup B = −∞ which
is true (remember we are excluding the case ∞ − ∞). Similarly, if B is empty.

If A and B are both nonempty, then sup A ≥ x for all x ∈ A, and sup B ≥ y
for all y ∈ B, so sup A + sup B ≥ x + y for all x ∈ A and y ∈ B. Hence,
sup A + sup B ≥ z for all z ∈ A + B, or sup A + sup B is an upper bound for
A + B. Since sup(A + B) is the least such, we conclude that sup A + sup B ≥
sup(A + B). If sup(A + B) = ∞, then the reverse inequality sup A + sup B ≤
sup(A + B) is immediate, yielding the result.

If, however, sup(A + B) < ∞ and x ∈ A, y ∈ B, then x + y ∈ A + B,
hence x + y ≤ sup(A + B) or, what is the same, x ≤ sup(A + B) − y.
Thus, sup(A + B) − y is an upper bound for A; since sup A is the least
such, we get sup A ≤ sup(A + B) − y. Now this last inequality implies, first,
sup A < ∞ and, second, y ≤ sup(A + B) − sup A for all y ∈ B. Thus,
sup(A + B) − sup A is an upper bound for B; since sup B is the least such, we
conclude that sup B ≤ sup(A + B) − sup A or, what is the same, sup(A + B) ≥
sup A + sup B. Since we already know that sup(A + B) ≤ sup A + sup B, we
obtain sup(A + B) = sup A + sup B.

To verify inf(A + B) = inf A + inf B, use reflection and what we just 
finished to write 


inf(A + B) = −sup[−(A + B)]
= −sup[(−A) + (−B)]
= −sup(−A) − sup(−B)
= inf A + inf B.


This completes the derivation of the addition property. 
Let X be a set and f : X → R a function. If A ⊂ X, throughout we use
the notation sup_A f and inf_A f to mean sup f(A) and inf f(A), respectively.

It is natural to ask about the existence of R. Where does it come from? 
More precisely, can a set R satisfying the above properties be constructed 
within the context of set theory as sketched in §1.1? The answer is not only 
that this is so but also that such a set is unique in the following sense: If 
S is any set endowed with its own arithmetic, ordering, and completeness 
properties, as above, there is a bijection R → S mapping the properties on R
to the properties on S. 

Because the construction of R would lead us too far afield and has no 
impact on the content of this text, we skip it. However, perhaps the most 
enlightening construction of R is via Conway’s surreal numbers, with the 
real numbers then being a specific subset of the surreal numbers.²

For us here, the explicit nature of the elements³ of R (the real numbers)
is immaterial. In summary then, every assertion that follows in this book 
depends only on the arithmetic, ordering, and completeness properties of the 
set R. 


Exercises 


1.2.1. Show that a0 = 0 for all real a. 


1.2.2. Show that there is a unique real playing the role of 1. Also show that 
each real a has a unique negative —a and each nonzero real a has a unique 
reciprocal. 


1.2.3. Show that −(−a) = a and −a = (−1)a.


1.2.4. Show that negative times positive is negative, negative times negative 
is positive, and 1 is positive. 


1.2.5. Show that a < b and c ∈ R imply a + c < b + c, a < b and c > 0 imply
ac < bc, a < b and b < c imply a < c, and 0 < a < b implies aa < bb.


1.2.6. Let a, b > 0. Show that a < b iff aa < bb.


1.2.7. Verify the properties of sup and inf listed above. 


² See Conway’s book [7].
³ The elements of R are themselves sets, since (§1.1) everything is a set.
1.3 The Subset N and the Principle of Induction


A subset S ⊂ R is inductive if


A. 1 ∈ S and
B. S is closed under addition by 1: x ∈ S implies x + 1 ∈ S.


For example, R⁺ is inductive. The subset N ⊂ R of natural numbers or
naturals is the intersection (§1.1) of all inductive subsets of R,


N = ⋂{S : S ⊂ R inductive}.


Then N itself is inductive: Indeed, since 1 ∈ S for every inductive set S, we
conclude that 1 is in the intersection of all the inductive sets, hence 1 ∈ N.
Similarly, n ∈ N implies n ∈ S for every inductive set S. Hence, n + 1 ∈ S for
every inductive set S. Hence, n + 1 is in the intersection of all the inductive
sets, hence n + 1 ∈ N. This shows that N is inductive.

From the definition, we conclude that N ⊂ S for any inductive S ⊂ R.
For example, since R⁺ is inductive, we conclude that N ⊂ R⁺, i.e., every
natural is positive.

From the definition, we also conclude that N is the only inductive subset
of N. For example, S = {1} ∪ (N + 1) is a subset of N, since N is inductive.
Clearly, 1 ∈ S. Moreover, x ∈ S implies x ∈ N implies x + 1 ∈ N + 1 implies
x + 1 ∈ S, so S is inductive. Hence, S = N or {1} ∪ (N + 1) = N, i.e., n − 1
is a natural for every natural n other than 1. Because of this, n ≥ 1 is often
used interchangeably with n ∈ N.

The conclusions above are often paraphrased by saying N is the smallest 
inductive subset of R, and they are so important they deserve a name. 


Theorem 1.3.1 (Principle of Induction). If S ⊂ R is inductive, then
S ⊃ N. If S ⊂ N is inductive, then S = N.


Let 2 = 1 + 1 > 1; we show that there are no naturals n between 1 and 2,
i.e., satisfying 1 < n < 2. For this, let S = {1} ∪ {n ∈ N : n ≥ 2}. Then 1 ∈ S.
If n ∈ S, there are two possibilities. Either n = 1 or n ≥ 2. If n = 1, then
n + 1 = 2 ∈ S. If n ≥ 2, then n + 1 > n ≥ 2 and n + 1 ∈ N, so n + 1 ∈ S. Hence,
S is inductive. Since S ⊂ N, we conclude that S = N. Thus, n ≥ 1 for all
n ∈ N, and there are no naturals between 1 and 2. Similarly (Exercise 1.3.1),
for any n ∈ N, there are no naturals between n and n + 1.

N is closed under addition and multiplication by any natural. To see this,
fix a natural n, and let S = {x : x + n ∈ N}, so S is the set of all reals x
whose sum with n is natural. Then 1 ∈ S since n + 1 ∈ N, and x ∈ S implies
x + n ∈ N implies (x + 1) + n = (x + n) + 1 ∈ N implies x + 1 ∈ S. Thus,
S is inductive. Since N is the smallest such set, we conclude that N ⊂ S or
m + n ∈ N for all m ∈ N. Thus, N is closed under addition. This we write
simply as N + N ⊂ N. Closure under multiplication N · N ⊂ N is similar
and left as an exercise.
In the sequel, when we apply the principle of induction, we simply say “by
induction.” 

To show that a given set S is inductive, one needs to verify A and B. Step 
B is often referred to as the inductive step, even though, strictly speaking, 
induction is both A and B, because, usually, most of the work is in establishing
B. Also the hypothesis in B, x ∈ S, is often referred to as the inductive
hypothesis. 

Let us give another example of the use of induction. A natural is even if
it is in 2N = {2n : n ∈ N}. A natural n is odd if n + 1 is even. We claim
that every natural is either even or odd. To see this, let S be the union of the
set of even naturals and the set of odd naturals. Then 2 = 2 · 1 is even, so 1
is odd. Hence, 1 ∈ S. If n ∈ S and n = 2k is even, then n + 1 is odd since
(n + 1) + 1 = n + 2 = 2k + 2 = 2(k + 1). Hence, n + 1 ∈ S. If n ∈ S and n
is odd, then n + 1 is even, so n + 1 ∈ S. Hence, in either case, n ∈ S implies
n + 1 ∈ S, i.e., S is closed under addition by 1. Thus, S is inductive. Hence,
we conclude that S = N. Thus, every natural is even or odd. Also the usual
parity rules hold: Even plus even is even, etc.

Let A be a nonempty set. We say A has n elements and we write #A = n
if there is a bijection between A and the set {k ∈ N : 1 ≤ k ≤ n}. We
often denote this last set by {1, 2, ..., n}. If A = ∅, we say that the number
of elements of A is zero. A set A is finite if it has n elements for some n.
Otherwise, A is infinite. Here are some consequences of the definition that
are worked out in the exercises. If A and B are disjoint and have n and m
elements, respectively, then A ∪ B has n + m elements. If A is a finite subset
of R, then max A and min A exist. In particular, we let max(a, b), min(a, b) 
denote the larger and the smaller of a and b. 

Now we show that max A and min A may exist for certain infinite subsets 
of R. 


Theorem 1.3.2. If S ⊂ N is nonempty, then min S exists.


To see this, note that c = inf S is finite since S is bounded below. The 
goal is to establish c € S. Since c+ 1 is not a lower bound, there is an 
née Swithe<n<ct+1.Ifc=n, then c € S and we are done. If c 4 n, 
then n-— 1 < c <n, and n is not a lower bound for S. Hence, there is an 
m € § lying between n — 1 and n. But there are no naturals between n — 1 
and n. 

Two other subsets mentioned frequently are the integers
$\mathbf{Z} = \mathbf{N} \cup \{0\} \cup (-\mathbf{N}) = \{0, \pm 1, \pm 2, \ldots\}$ and the rationals
$\mathbf{Q} = \{m/n : m, n \in \mathbf{Z},\ n \ne 0\}$.
Then $\mathbf{Z}$ is closed under subtraction (Exercise 1.3.3), and $\mathbf{Q}$ is closed under
all four arithmetic operations, except under division by zero. As for naturals,
we say that the integers in $2\mathbf{Z} = \{2n : n \in \mathbf{Z}\}$ are even, and we say that an
integer $n$ is odd if $n+1$ is even.

A sequence of reals is a function $f : \mathbf{N} \to \mathbf{R}$. Sequences are usually written
$(a_n) = (a_1, a_2, \ldots)$ where $a_n = f(n)$, $n \ge 1$.

Using (an extension of) induction, one can build new sequences from old
sequences as follows. Let $\mathbf{N} \times \mathbf{R}$ be the product (§1.1) and fix a function
$g : \mathbf{N} \times \mathbf{R} \to \mathbf{R}$ and a real $a \in \mathbf{R}$. Then (Exercise 1.3.10) there is a unique
function $f : \mathbf{N} \to \mathbf{R}$ satisfying $f(1) = a$ and $f(n+1) = g(n, f(n))$, $n \ge 1$, or,
equivalently, there is a unique sequence $(a_n)$ satisfying $a_1 = a$ and
$a_{n+1} = g(n, a_n)$, $n \ge 1$.

For example, given a sequence $(a_n)$, there is a sequence $(s_n)$ satisfying
$s_1 = a_1$ and $s_{n+1} = a_{n+1} + s_n$, $n \ge 1$. This is usually written

\[
s_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k, \qquad n \ge 1,
\]

and corresponds to the choices $a = a_1$ and $g(n,x) = x + a_{n+1}$. Then $s_n$ is
the nth partial sum, $n \ge 1$ (§1.5).

Fix a natural $N \in \mathbf{N}$ and suppose we are given a function
$a : \{1, 2, \ldots, N\} \to \mathbf{R}$, i.e., we are given reals $a_1, a_2, \ldots, a_N$. Then we can extend the
definition of $a$ to all of $\mathbf{N}$ by setting $a_n = 0$ for $n > N$, and the Nth partial
sum

\[
a_1 + a_2 + \cdots + a_N = \sum_{n=1}^{N} a_n
\]

is how one defines the sum of $a_1, a_2, \ldots, a_N$.

Similarly, given a sequence $(a_n)$, there is a sequence $(p_n)$ satisfying $p_1 = a_1$
and $p_{n+1} = p_n \cdot a_{n+1}$, $n \ge 1$. This is usually written

\[
p_n = a_1 \cdot a_2 \cdots a_n = \prod_{k=1}^{n} a_k, \qquad n \ge 1,
\]

and corresponds to the choices $a = a_1$ and $g(n,x) = x \cdot a_{n+1}$. Then $p_n$ is the
nth partial product (§5.6). For example, if $a$ is a fixed real and $a_n = a$ for all
$n \ge 1$, the resulting sequence is $(a^n)$.

When $a_n = n$, $n \ge 1$, the resulting sequence of partial products satisfies
$p_1 = 1$ and $p_{n+1} = p_n (n+1)$, $n \ge 1$; this is the factorial sequence $(n!)$. It is
convenient to also define $0! = 1$.
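
The recursive construction $f(1) = a$, $f(n+1) = g(n, f(n))$ can be illustrated with a short Python sketch. The names `recursive_sequence` and `first_terms`, and the sample sequence $a_n = n$, are hypothetical choices made for this illustration only.

```python
from itertools import islice

def recursive_sequence(a, g):
    """Yield f(1), f(2), ... where f(1) = a and f(n+1) = g(n, f(n))."""
    n, value = 1, a
    while True:
        yield value
        value = g(n, value)
        n += 1

def first_terms(a, g, count):
    """Return the first `count` terms of the recursively defined sequence."""
    return list(islice(recursive_sequence(a, g), count))

# Sample sequence a_n = n (1-indexed), chosen only for illustration.
def a(n):
    return n

# Partial sums: s_1 = a_1 and g(n, x) = x + a_{n+1}.
partial_sums = first_terms(a(1), lambda n, x: x + a(n + 1), 5)
# Partial products: p_1 = a_1 and g(n, x) = x * a_{n+1}; with a_n = n this is n!.
factorials = first_terms(a(1), lambda n, x: x * a(n + 1), 5)

print(partial_sums)  # [1, 3, 6, 10, 15]
print(factorials)    # [1, 2, 6, 24, 120]
```

Only the starting value and the function $g$ change between the two examples, which mirrors how the text obtains partial sums and partial products from a single construction.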

Fix a natural $N \in \mathbf{N}$ and suppose we are given a function
$a : \{1, 2, \ldots, N\} \to \mathbf{R}$, i.e., we are given reals $a_1, a_2, \ldots, a_N$. Then we can extend the
definition of $a$ to all of $\mathbf{N}$ by setting $a_n = 1$ for $n > N$, and the Nth partial
product

\[
a_1 \cdot a_2 \cdots a_N = \prod_{n=1}^{N} a_n
\]

is how one defines the product of $a_1, a_2, \ldots, a_N$.

Now $(-1)^n$ is 1 or $-1$ according to whether $n \in \mathbf{N}$ is even or odd, $a > 0$
implies $a^n > 0$ for $n \in \mathbf{N}$, and $a > 1$ implies $a^n > 1$ for $n \in \mathbf{N}$. These are
easily checked by induction.

If $a \ne 0$, we extend the definition of $a^n$ to $n \in \mathbf{Z}$ by setting $a^0 = 1$
and $a^{-n} = 1/a^n$ for $n \in \mathbf{N}$. Then (Exercise 1.3.11), $a^{n+m} = a^n a^m$ and
$(a^n)^m = a^{nm}$ for all integers $n$, $m$.

Let $a > 1$. Then $a^n = a^m$ with $n, m \in \mathbf{Z}$ only when $n = m$. Indeed,
$n - m \in \mathbf{Z}$, and $a^{n-m} = a^n a^{-m} = a^n/a^m = 1$. But $a^k > 1$ for $k \in \mathbf{N}$, and
$a^k = 1/a^{-k} < 1$ for $k \in -\mathbf{N}$. Hence, $n - m = 0$ or $n = m$. This shows that
powers are unique.

As another application of induction, we establish, simultaneously, the
validity of the inequalities $1 < 2^n$ and $n < 2^n$ for all naturals $n$. This time, we
do this without mentioning the set $S$ explicitly, as follows: The inequalities
in question are true for $n = 1$ since $1 < 2^1 = 2$. Moreover, if the inequalities
$1 < 2^n$ and $n < 2^n$ are true for a particular $n$ (the inductive hypothesis),
then $1 < 2^n < 2^n + 2^n = 2^n \cdot 2 = 2^{n+1}$, so the first inequality is true for $n+1$.
Adding the inequalities valid for $n$ yields $n + 1 < 2^n + 2^n = 2^n \cdot 2 = 2^{n+1}$, so
the second inequality is true for $n+1$. This establishes the inductive step.
Hence, by induction, the two inequalities are true for all $n \in \mathbf{N}$. Explicitly,
the set $S$ here is $S = \{n \in \mathbf{N} : 1 < 2^n,\ n < 2^n\}$.

Using these inequalities, we show that every nonzero $n \in \mathbf{Z}$ is of the form
$2^k p$ for a uniquely determined $k \in \mathbf{N} \cup \{0\}$ and an odd $p \in \mathbf{Z}$. We call $k$ the
number of factors of 2 in $n$.

If $2^k p = 2^j q$ with $k > j$ and odd integers $p$, $q$, then $q = 2^{k-j} p = 2 \cdot 2^{k-j-1} p$
is even, a contradiction. On the other hand, if $j > k$, then $p$ is even. Hence,
we must have $k = j$. This establishes the uniqueness of $k$.

To show the existence of $k$, by multiplying by a minus, if necessary, we
may assume $n \in \mathbf{N}$. If $n$ is odd, we may take $k = 0$ and $p = n$. If $n$ is even,
then $n_1 = n/2$ is a natural $< 2^{n-1}$. If $n_1$ is odd, we take $k = 1$ and $p = n_1$.
If $n_1$ is even, then $n_2 = n_1/2$ is a natural $< 2^{n-2}$. If $n_2$ is odd, we take $k = 2$
and $p = n_2$. If $n_2$ is even, we continue this procedure by dividing $n_2$ by 2.
Continuing in this manner, we obtain naturals $n_1, n_2, \ldots$ with $n_j < 2^{n-j}$.
Since this procedure ends in fewer than $n$ steps, there is some $k$ natural or 0
for which $p = n/2^k$ is odd.
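
The halving procedure just described can be sketched in Python. The function name `factors_of_two` is a hypothetical choice; unlike the text, the sketch halves negative integers directly instead of first multiplying by a minus, which yields the same $k$.

```python
def factors_of_two(n):
    """Return (k, p) with n == 2**k * p and p odd, for a nonzero integer n."""
    if n == 0:
        raise ValueError("n must be nonzero")
    k, p = 0, n
    while p % 2 == 0:   # p is even: divide by 2 once more
        p //= 2
        k += 1
    return k, p

print(factors_of_two(40))   # (3, 5)
print(factors_of_two(-12))  # (2, -3)
print(factors_of_two(7))    # (0, 7)
```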

The final issue we take up here concerns square roots. Given a real $a$, a
square root of $a$, denoted $\sqrt{a}$, is any real $x$ whose square is $a$, $x^2 = a$. For
example, 1 has the square roots $\pm 1$, and 0 has the square root 0. On the
other hand, not every real has a square root. For example, $-1$ does not have
a square root, i.e., there is no real $x$ satisfying $x^2 = -1$, since $x^2 + 1 > 0$.
In fact a similar argument shows that negative numbers never have square
roots.

At this point, we do not know whether 2 has a square root. First, we show 
that 2 cannot be rational. See also Exercises 1.4.12 and 1.4.13. 


Theorem 1.3.3. There is no rational $a$ satisfying $a^2 = 2$.

We argue by contradiction. Suppose that $a = m/n$ is a rational whose
square is 2. Then $(m/n)^2 = 2$ or $m^2 = 2n^2$, i.e., there is a natural $N$ such
that $N = m^2$ and $N = 2n^2$. Then $m = 2^k p$ with odd $p$ and $k \in \mathbf{N} \cup \{0\}$,
so $N = m^2 = 2^{2k} p^2$. Since $p^2$ is odd, we conclude that $2k$ is the number
of factors of 2 in $N$. Similarly $n = 2^j q$ with odd $q$ and $j \in \mathbf{N} \cup \{0\}$, so
$N = 2n^2 = 2 \cdot 2^{2j} q^2 = 2^{2j+1} q^2$. Since $q^2$ is odd, we conclude that $2j+1$ is the
number of factors of 2 in $N$. Since $2k \ne 2j+1$, we arrive at a contradiction.


Note that Q satisfies the arithmetic and ordering properties. The com- 
pleteness property is all that distinguishes Q and R. 

As usual, in the following, a digit means either 0, 1, 2, or $3 = 2+1$,
$4 = 3+1$, $5 = 4+1$, $6 = 5+1$, $7 = 6+1$, $8 = 7+1$, or $9 = 8+1$.
Also the letters $n$, $m$, $i$, $j$ will usually denote integers, so $n \ge 1$ will be used
interchangeably with $n \in \mathbf{N}$, with similar remarks for $m$, $i$, $j$.

We say that a nonzero $n \in \mathbf{Z}$ divides $m \in \mathbf{Z}$ if $m/n \in \mathbf{Z}$. Alternatively, we
say that $m$ is divisible by $n$, and we write $n \mid m$. A natural $n$ is composite if
$n = jk$ for some $j, k \in \mathbf{N}$ with $j > 1$ and $k > 1$. A natural is prime if it is
not composite and is not 1. Thus, a natural is prime if it is not divisible by
any smaller natural other than 1.

Let $n \in \mathbf{Z}$. A factor of $n$ is an integer $p$ dividing $n$. We say integers $n$ and $m$
have no common factor if the only natural dividing $n$ and $m$ simultaneously
is 1.

For $a \ge 1$, let $\lfloor a \rfloor = \max\{n \in \mathbf{N} : n \le a\}$ denote the greatest integer $\le a$
(Exercises 1.3.7 and 1.3.9). Then $\lfloor a \rfloor \le a < \lfloor a \rfloor + 1$; the fractional part
of $a$ is $\{a\} = a - \lfloor a \rfloor$. Note that the fractional part is a real in $[0,1)$. More
generally, $\lfloor a \rfloor \in \mathbf{Z}$ and $0 \le \{a\} < 1$ are defined⁴ for all $a \in \mathbf{R}$.
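
As a small illustration of the greatest integer and the fractional part, the following Python sketch uses exact rationals from the standard library. The helper name `floor_and_frac` is hypothetical, and restricting to `Fraction` inputs is an assumption made to keep the arithmetic exact.

```python
import math
from fractions import Fraction

def floor_and_frac(a):
    """Return (floor(a), {a}) with a == floor(a) + {a} and 0 <= {a} < 1."""
    whole = math.floor(a)      # greatest integer <= a
    return whole, a - whole    # fractional part {a} = a - floor(a)

print(floor_and_frac(Fraction(7, 2)))   # (3, Fraction(1, 2))
print(floor_and_frac(Fraction(-7, 2)))  # (-4, Fraction(1, 2))
```

The second call shows that for negative reals the greatest integer lies below the number, so the fractional part still lands in $[0,1)$.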


Exercises 


1.3.1. Let $n$ be a natural. Show that there are no naturals between $n$ and
$n+1$.


1.3.2. Show that the product of naturals is natural, $\mathbf{N} \cdot \mathbf{N} \subset \mathbf{N}$.


1.3.3. If $m > n$ are naturals, then $m - n \in \mathbf{N}$. Conclude that $\mathbf{Z}$ is closed
under subtraction.


1.3.4. Show that no integer is both even and odd. Also show that even times
even is even, even times odd is even, and odd times odd is odd.


1.3.5. If $n$, $m$ are naturals and there is a bijection between $\{1, 2, \ldots, n\}$ and
$\{1, 2, \ldots, m\}$, then $n = m$ (use induction on $n$). Conclude that the number
of elements $\#A$ of a nonempty set $A$ is well defined. Also show that $\#A = n$,
$\#B = m$, and $A \cap B = \emptyset$ imply $\#(A \cup B) = n + m$.


4 $\{n \in \mathbf{Z} : n \le a\}$ is nonempty since $\inf \mathbf{Z} = -\infty$ (§1.4).




1.3.6. If $A \subset \mathbf{R}$ is finite and nonempty, then show that $\max A$ and $\min A$
exist (use induction).


1.3.7. If $S \subset \mathbf{Z}$ is nonempty and bounded above, then show that $S$ has a
max.


1.3.8. For $x > 0$ real, let $S = \{n \in \mathbf{N} : nx \in \mathbf{N}\}$. Show that $S$ is nonempty
iff $x \in \mathbf{Q}$. Show that $x = n/d$ with $n, d \in \mathbf{N}$ having no common factor iff
$d = \min S$ and $n = dx$. If we set $D(x) = d$, show that $D(x) = D(x + k)$ for
$k \in \mathbf{N}$.


1.3.9. If $x > y > 0$ are reals, then show that $x = yq + r$ with $q \in \mathbf{N}$,
$r \in \mathbf{R}^+ \cup \{0\}$, and $r < y$. (Look at the sup of $\{q \in \mathbf{N} : yq \le x\}$.)


1.3.10. Let $X$ be a set and let $g : \mathbf{N} \times X \to X$ be a mapping and fix $a \in X$.
A set $f \subset \mathbf{N} \times X$ is inductive if


A. $(1, a) \in f$,
B. $(n, x) \in f$ implies $(n+1, g(n,x)) \in f$.


For example, $\mathbf{N} \times X$ is inductive⁵ for any $g$ and any $a$. Now let $f$ be the
intersection of all inductive subsets of $\mathbf{N} \times X$,

\[
f = \bigcap \{h \subset \mathbf{N} \times X : h \text{ inductive}\},
\]

and let $A = \{n \in \mathbf{N} : (n,x) \in f \text{ for some } x \in X\}$. Show


• $A = \mathbf{N}$,
• $f$ is a mapping (§1.1) with domain $\mathbf{N}$ and codomain $X$,
• $f(1) = a$ and $f(n+1) = g(n, f(n))$ for all $n \ge 1$.


Show also that there is a unique such function. (Given $n \in \mathbf{N}$, let
$B_n = \{x : (n,x) \in f\}$. Then $f$ is a mapping iff $\#B_n = 1$ for all $n$. Let
$B = \{n : \#B_n > 1\}$ and use Theorem 1.3.2 to show $B$ is empty.)


1.3.11. Let $a$ be a nonzero real. By induction, show that $a^n a^m = a^{n+m}$ and
$(a^n)^m = a^{nm}$ for all integers $n$, $m$.


1.3.12. Using induction, show that

\[
1 + 2 + \cdots + n = \frac{n(n+1)}{2}, \qquad n \ge 1.
\]


1.3.13. Let $p > 1$ be a natural. Show that for each nonzero $n \in \mathbf{Z}$, there is
a unique $k \in \mathbf{N} \cup \{0\}$ and an integer $m$ not divisible by $p$ (i.e., $m/p$ is not
in $\mathbf{Z}$), such that $n = p^k m$.


5 Note here that inductive depends on $g$ and on $a$.


1.3.14. Let $S \subset \mathbf{R}$ satisfy


• $1 \in S$,
• $n \in S$ whenever $k \in S$ for all naturals $k < n$.


Show that $S \supset \mathbf{N}$. This is an alternate form of induction.


1.3.15. Fix $a > 0$ real, and let $S_a = \{n \in \mathbf{N} : na \in \mathbf{N}\}$. If $S_a$ is nonempty,
$m \in S_a$, and $p = \min S_a$, show that $p$ divides $m$ (Exercise 1.3.9).


1.3.16. Let $n$, $m$ be naturals and suppose that a prime $p$ divides the product
$nm$. Show that $p$ divides $n$ or $m$. (Consider $a = n/p$, and show that $\min S_a = 1$
or $\min S_a = p$.)


1.3.17. (Fundamental Theorem of Arithmetic) By induction, show that
every natural $n$ either is 1 or is a product of primes, $n = p_1 \cdots p_r$, with the
$p_j$'s unique except, possibly, for the ordering. (Given $n$, either $n$ is prime or
$n = pm$ for some natural $1 < m < n$; use induction as in Exercise 1.3.14.)


1.3.18. Given $0 < x < 1$, let $r_0 = x$. Define naturals $a_n$ and remainders $r_n$
by setting

\[
\frac{1}{r_n} = a_{n+1} + r_{n+1}, \qquad n \ge 0.
\]

Thus, $a_{n+1} = \lfloor 1/r_n \rfloor$ is the integer part of $1/r_n$ and $r_{n+1} = \{1/r_n\}$ is the
fractional part of $1/r_n$, and

\[
x = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots\; a_{n-1} + \cfrac{1}{a_n + r_n}}}}
\]

is a continued fraction.⁶ This algorithm stops the first time $r_n = 0$. Then the
continued fraction is finite and we write $x = [a_1, a_2, a_3, \ldots, a_n]$. If this never
happens, this algorithm does not end, and the continued fraction is infinite
and we write $x = [a_1, a_2, a_3, \ldots]$. Show that the algorithm stops iff $x \in \mathbf{Q}$.
Computer code generating $[a_1, a_2, a_3, \ldots]$ is in Exercise 5.2.11.
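
For a rational $x$, the algorithm of Exercise 1.3.18 can be run exactly. The Python sketch below is an illustration only, not the code referred to in Exercise 5.2.11; the function name `continued_fraction` and the safety cap `max_terms` are assumptions introduced here.

```python
from fractions import Fraction

def continued_fraction(x, max_terms=50):
    """Digits [a1, a2, ...] of a rational 0 < x < 1 via 1/r_n = a_{n+1} + r_{n+1}."""
    if not 0 < x < 1:
        raise ValueError("x must satisfy 0 < x < 1")
    r = Fraction(x)                            # r_0 = x
    digits = []
    while r != 0 and len(digits) < max_terms:
        inv = 1 / r
        a = inv.numerator // inv.denominator   # integer part of 1/r_n
        digits.append(a)
        r = inv - a                            # fractional part of 1/r_n
    return digits

print(continued_fraction(Fraction(3, 7)))   # [2, 3]
print(continued_fraction(Fraction(5, 13)))  # [2, 1, 1, 2]
```

For instance, $3/7 = 1/(2 + 1/3)$, which matches the output $[2, 3]$.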


1.3.19. Let $f : \mathbf{N} \to \mathbf{N}$ be injective, and for $n \ge 1$, let $A_n = \{m \in \mathbf{N} :
f(m) < n\}$. Show by induction that $A_n$ is bounded above for all $n \in \mathbf{N}$.


1.3.20. Show that $2^{n-1} \le n!$, $n \ge 1$.


1.3.21. Show that $n! \le n^n \le (n!)^2$ for all naturals $n$. For the second
inequality, rearrange the factors in $(n!)^2$ into pairs.


1.3.22. Show that $(1+a)^n \le 1 + (2^n - 1)a$ for $n \ge 1$ and $0 \le a \le 1$. Also
show that $(1+a)^n \ge 1 + na$ for $n \ge 1$ and $a \ge -1$.


6 Because the numerators are all equal to 1, this is a simple continued fraction.


1.3.23. Let $0 < x < y < z < 1$. Express $x$, $y$, $z$ as continued fractions as in
Exercise 1.3.18: $x = [a_1, a_2, \ldots]$, $y = [b_1, b_2, \ldots]$, $z = [c_1, c_2, \ldots]$. If $a_j = c_j$,
$j = 1, \ldots, n$, then $a_j = b_j = c_j$, $j = 1, \ldots, n$. Use induction on $n$.


1.3.24. Let $X$ be a finite set with $\emptyset \notin X$. Show there is a function
$f : X \to \bigcup X$ such that $f(x) \in x$ for $x \in X$. This is the axiom of finite choice
(use induction on $\#X$). Equivalently, given $A_1, \ldots, A_n$ nonempty, there are
$a_1, \ldots, a_n$ with $a_k \in A_k$, $1 \le k \le n$.


1.4 The Completeness Property 


We begin by showing that $\mathbf{N}$ has no upper bound. Indeed, if $\mathbf{N}$ has an upper
bound, then $\mathbf{N}$ has a (finite) sup, and call it $c$. Then $c$ is an upper bound
for $\mathbf{N}$, whereas $c - 1$ is not an upper bound for $\mathbf{N}$, since $c$ is the least such.
Thus, there is an $n \ge 1$, satisfying $n > c - 1$, which gives $n + 1 > c$ and
$n + 1 \in \mathbf{N}$. But this contradicts the fact that $c$ is an upper bound. Hence, $\mathbf{N}$
is not bounded above. In the notation of §1.2, $\sup \mathbf{N} = \infty$.

Let $S = \{1/n : n \in \mathbf{N}\}$ be the reciprocals of all naturals. Then $S$ is
bounded below by 0, hence $S$ has an inf. We show that $\inf S = 0$. First, since
0 is a lower bound, by definition of inf, $\inf S \ge 0$. Second, let $c > 0$. Since
$\sup \mathbf{N} = \infty$, there is some natural, call it $k$, satisfying $k > 1/c$. Multiplying
this inequality by the positive $c/k$, we obtain $c > 1/k$. Since $1/k$ is an element
of $S$, this shows that $c$ is not a lower bound for $S$. Thus, any lower bound for
$S$ must be less than or equal to 0. Hence, $\inf S = 0$.

The two results just derived are so important we state them again. 


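Theorem 1.4.1. $\sup \mathbf{N} = \infty$, and $\inf\{1/n : n \in \mathbf{N}\} = 0$.

As a consequence, since $\mathbf{Z} \supset \mathbf{N}$, it follows that $\sup \mathbf{Z} = \infty$. Since $\mathbf{Z} \supset (-\mathbf{N})$
and $\inf(A) = -\sup(-A)$, it follows that $\inf \mathbf{Z} \le \inf(-\mathbf{N}) = -\sup \mathbf{N} = -\infty$,
hence $\inf \mathbf{Z} = -\infty$.

An interval is a subset of $\mathbf{R}$ of the following form:

\[
\begin{aligned}
(a,b) &= \{x : a < x < b\},\\
[a,b] &= \{x : a \le x \le b\},\\
[a,b) &= \{x : a \le x < b\},\\
(a,b] &= \{x : a < x \le b\}.
\end{aligned}
\]

Intervals of the form $(a,b)$, $(a,\infty)$, $(-\infty,b)$, $(-\infty,\infty)$ are open, whereas those
of the form $[a,b]$, $[a,\infty)$, $(-\infty,b]$ are closed. When $-\infty < a < b < \infty$, the
interval $[a,b]$ is compact. Thus, $(a,\infty) = \{x : x > a\}$, $(-\infty,b] = \{x : x \le b\}$,
and $(-\infty,\infty) = \mathbf{R}$.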

For $x \in \mathbf{R}$, we define $|x|$, the absolute value of $x$, by $|x| = \max(x, -x)$.
Then $x \le |x|$ for all $x$, and, for $a > 0$, $\{x : -a < x < a\} = \{x : |x| < a\} =
\{x : x < a\} \cap \{x : x > -a\}$, $\{x : x < -a\} \cup \{x : x > a\} = \{x : |x| > a\}$.

The absolute value satisfies the following properties:


A. $|x| > 0$ for all nonzero $x$, and $|0| = 0$,
B. $|x|\,|y| = |xy|$ for all $x$, $y$,
C. $|x + y| \le |x| + |y|$ for all $x$, $y$.


We leave the first two as exercises. The third, the triangle inequality, is
derived using $|x|^2 = x^2$ as follows:

\[
\begin{aligned}
|x+y|^2 &= (x+y)^2 = x^2 + 2xy + y^2\\
&\le |x|^2 + 2|xy| + |y|^2 = |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2.
\end{aligned}
\]

Since $a \le b$ iff $a^2 \le b^2$ for $a$, $b$ nonnegative (Exercise 1.2.6), the triangle
inequality is established.

Frequently, the triangle inequality is used in alternate forms, one of
which is
\[
|x - y| \ge |x| - |y|.
\]
This follows by writing $|x| = |(x - y) + y| \le |x - y| + |y|$ and transposing $|y|$
to the other side. Another form is

\[
|a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n|, \qquad n \ge 1.
\]


We show how the completeness property can be used to derive the existence
of $\sqrt{2}$ within $\mathbf{R}$.


Theorem 1.4.2. There is a real $a$ satisfying $a^2 = 2$.


To see this, let $S = \{x : x \ge 1 \text{ and } x^2 < 2\}$. Since $1 \in S$, $S$ is nonempty.
Also $x \in S$ implies $x = x \cdot 1 \le x \cdot x = x^2 < 2$, hence $S$ is bounded above by 2,
hence $S$ has a sup, call it $a$. We claim that $a^2 = 2$. We establish this claim by
ruling out the cases $a^2 < 2$ and $a^2 > 2$, leaving us with the desired conclusion
(remember every real is positive or negative or zero).

So suppose that $a^2 < 2$. If we find a natural $n$ with $(a + 1/n)^2 < 2$, then
$a + 1/n \in S$, hence the real $a$ could not have been an upper bound for $S$,
much less the least such. To see how to find such an $n$, note that

\[
\left(a + \frac{1}{n}\right)^2 = a^2 + \frac{2a}{n} + \frac{1}{n^2} \le a^2 + \frac{2a}{n} + \frac{1}{n} = a^2 + \frac{2a+1}{n}.
\]

But this last quantity is $< 2$ if $(2a+1)/n < 2 - a^2$, i.e., if $n > (2a+1)/(2 - a^2)$.
Since $a^2 < 2$, $b = (2a+1)/(2 - a^2) > 0$; since $\sup \mathbf{N} = \infty$, such a natural
$n > b$ can always be found. This rules out $a^2 < 2$.
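
The choice of $n$ in this step can be checked on a concrete rational value. The Python sketch below is an illustration under stated assumptions: the helper name `next_element` is hypothetical, and the input is restricted to rationals so that the arithmetic is exact.

```python
from fractions import Fraction
import math

def next_element(a):
    """Given rational a >= 1 with a*a < 2, return a + 1/n with (a + 1/n)**2 < 2."""
    a = Fraction(a)
    if not (a >= 1 and a * a < 2):
        raise ValueError("need a >= 1 and a^2 < 2")
    b = (2 * a + 1) / (2 - a * a)   # the bound b = (2a+1)/(2-a^2) from the text
    n = math.floor(b) + 1           # a natural n > b
    return a + Fraction(1, n)

a = Fraction(7, 5)
bigger = next_element(a)
print(bigger, bigger ** 2 < 2)  # 677/480 True
```

Here $a = 7/5$ gives $b = 95$, so $n = 96$ and $a + 1/n = 677/480$ is a larger element of $S$, as the argument predicts.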
Before we rule out $a^2 > 2$, we note that $S$ is bounded above by any positive
$b$ satisfying $b^2 > 2$ since, for $b$ and $x$ positive, $b^2 > x^2$ iff $b > x$.

Now suppose that $a^2 > 2$. Then $b = (a^2 - 2)/(2a)$ is positive, hence there is
a natural $n$ satisfying $1/n < b$ which implies $a^2 - 2a/n > 2$. Hence,

\[
\left(a - \frac{1}{n}\right)^2 = a^2 - \frac{2a}{n} + \frac{1}{n^2} > a^2 - \frac{2a}{n} > 2,
\]

so $a - 1/n$ is an upper bound for $S$. This shows that $a$ is not the least upper
bound, contradicting the definition of $a$. Thus, we are forced to conclude that
$a^2 = 2$.
A real $a$ satisfying $a^2 = 2$ is called a square root of 2. Since $(-x)^2 = x^2$,
there are two square roots of 2, one positive and one negative. From now
on, the positive square root is denoted $\sqrt{2}$. Similarly, every positive $a$ has a
positive square root, which we denote $\sqrt{a}$. In the next chapter, after we have
developed more material, a simpler proof of this fact will be derived.

More generally, for every $b > 0$ and $n \ge 1$, there is a unique $a > 0$ satisfying
$a^n = b$, the nth root $a = b^{1/n}$ of $b$. Now for $n \ge 1$, $k \ge 1$, and $m \in \mathbf{Z}$,

\[
\left[(b^m)^{1/n}\right]^{nk} = \left\{\left[(b^m)^{1/n}\right]^{n}\right\}^{k} = (b^m)^k = b^{mk};
\]

hence, by uniqueness of roots, $(b^m)^{1/n} = (b^{mk})^{1/nk}$. Thus, for $r = m/n$
rational, we may set $b^r = (b^m)^{1/n}$, defining rational powers of positive reals.

Since $\sqrt{2} \notin \mathbf{Q}$, $\mathbf{R} \setminus \mathbf{Q}$ is not empty. The reals in $\mathbf{R} \setminus \mathbf{Q}$ are the irrationals.
In fact, both the rationals and the irrationals have an interlacing or density
property.


Theorem 1.4.3. If $a < b$ are any two reals, there is a rational $s$ between
them, $a < s < b$, and there is an irrational $t$ between them, $a < t < b$.


To see this, first, choose a natural $n$ satisfying $1/n < b - a$. Second, let
$S = \{m \in \mathbf{N} : na < m\}$, and let $k = \inf S = \min S$. Since $k \in S$, $na < k$.
Since $k - 1 \notin S$, $k - 1 \le na$. Hence, $s = k/n$ satisfies

\[
a < s \le a + \frac{1}{n} < b.
\]
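
The construction of $s$ can be sketched in Python with exact rationals. The function name `rational_between` is hypothetical, and the sketch takes $k$ to be the least integer (rather than the least natural) exceeding $na$, a small variation that also covers negative $a$.

```python
from fractions import Fraction
import math

def rational_between(a, b):
    """Return a rational s with a < s < b, following the proof's choice of n and k."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1   # natural n with 1/n < b - a
    k = math.floor(n * a) + 1         # least integer k with n*a < k
    return Fraction(k, n)

s = rational_between(Fraction(1, 3), Fraction(1, 2))
print(s)  # 3/7
```

With $a = 1/3$ and $b = 1/2$ the sketch picks $n = 7$ and $k = 3$, and indeed $1/3 < 3/7 < 1/2$.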


For the second assertion, choose a natural $n$ satisfying $1/(n\sqrt{2}) < b - a$, let
$T = \{m \in \mathbf{N} : \sqrt{2}\,na < m\}$, and let $k = \min T$. Since $k \in T$, $k > \sqrt{2}\,na$. Since
$k - 1 \notin T$, $k - 1 \le \sqrt{2}\,na$. Hence, $t = k/(n\sqrt{2})$ satisfies

\[
a < t \le a + \frac{1}{n\sqrt{2}} < b.
\]

Moreover, $t$ is necessarily irrational.
Approximation of reals by rationals is discussed further in the exercises.


Exercises


1.4.1. Show that $x \le |x|$ for all $x$ and, for $a > 0$, $\{x : -a < x < a\} = \{x :
|x| < a\} = \{x : x < a\} \cap \{x : x > -a\}$, $\{x : x < -a\} \cup \{x : x > a\} = \{x :
|x| > a\}$.


1.4.2. For all $x \in \mathbf{R}$, $|x| \ge 0$, $|x| > 0$ if $x \ne 0$, and $|x|\,|y| = |xy|$ for all
$x, y \in \mathbf{R}$.


1.4.3. By induction, show that $|a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n|$
for $n \ge 1$.


1.4.4. Show that every $a > 0$ real has a unique positive square root.


1.4.5. Show that $ax^2 + bx + c = 0$, $a \ne 0$, has two, one, or no solutions in $\mathbf{R}$
according to whether $b^2 - 4ac$ is positive, zero, or negative. When there are
solutions, they are given by $x = (-b \pm \sqrt{b^2 - 4ac})/2a$.


1.4.6. For $a, b > 0$, show that $a^n > b^n$ iff $a > b$. Also show that every $b > 0$
has a unique positive nth root for all $n \ge 1$ (use Exercise 1.3.22 and modify
the derivation for $\sqrt{2}$).


1.4.7. Show that the real t constructed in the derivation of Theorem 1.4.3 is 
irrational. 


1.4.8. Let $a$ be any real. Show that, for each $\epsilon > 0$, no matter how small,
there are integers $n \ne 0$, $m$ satisfying


(Let $\{a\}$ denote the fractional part of $a$, consider $S = \{a\}, \{2a\}, \{3a\}, \ldots$,
and divide $[0,1]$ into finitely many subintervals of length less than $\epsilon$. Since
there are infinitely many terms in $S$, at least two of them must lie in the
same subinterval.)


1.4.9. Show that $a = \sqrt{2}$ satisfies

\[
\left| a - \frac{m}{n} \right| \ge \frac{1}{(2\sqrt{2} + 1)\,n^2}, \qquad n, m \ge 1.
\]

(Consider the two cases $|a - m/n| \ge 1$ and $|a - m/n| \le 1$, separately, and
look at the minimum of $n^2 |f(m/n)|$ with $f(x) = x^2 - 2$.)


1.4.10. Let $a = \sqrt{1 + \sqrt{2}}$. Then $a$ is irrational, and there is a positive real $c$
satisfying


(Factor $f(a) = a^4 - 2a^2 - 1 = 0$, and proceed as in the previous exercise.)


1.4.11. For $n \in \mathbf{Z} \setminus \{0\}$, define $|n|_2 = 1/2^k$ where $k$ is the number of factors
of 2 in $n$. Also define $|0|_2 = 0$. For $n/m \in \mathbf{Q}$ define

\[
|n/m|_2 = |n|_2 / |m|_2.
\]

Show that $|\cdot|_2 : \mathbf{Q} \to \mathbf{R}$ is well defined and satisfies the absolute value
properties A, B, and C.


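1.4.12. Show that $\sqrt{2}$ is not rational by showing $x = \sqrt{2} - 1$ satisfies
$x = 1/(2 + x)$, hence

\[
\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}
\]

(Exercise 1.3.18).


1.4.13. For $n \in \mathbf{N}$ let $x = \sqrt{n}$. Then either $x \in \mathbf{N}$ or $x \notin \mathbf{Q}$. This is another
proof of the irrationality of $\sqrt{2}$. (Consider $d' = d(x - \lfloor x \rfloor)$ where $d = D(x)$ is
as in Exercise 1.3.8.)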


1.5 Sequences and Limits 


A sequence of real numbers is a function $f : \mathbf{N} \to \mathbf{R}$. A finite sequence is a
function $f : \{1, \ldots, N\} \to \mathbf{R}$ for some $N \ge 1$. Usually, we write a sequence
as $(a_n)$ where $a_n = f(n)$ is the nth term. For example, the formulas $a_n = n$,
$b_n = 2n$, $c_n = 2^n$, and $d_n = 2^{-n} + 5n$ yield sequences $(a_n)$, $(b_n)$, $(c_n)$, and
$(d_n)$. Later we will consider sequences of sets $(Q_n)$ and sequences of functions
$(f_n)$, but now we discuss only sequences of reals.

It is important to distinguish between the sequence $(a_n)$ (the function $f$)
and the set $\{a_n\}$ (the range $f(\mathbf{N})$ of $f$). In fact, a sequence is an ordered
set $(a_1, a_2, a_3, \ldots)$ and not just a set $\{a_1, a_2, a_3, \ldots\}$. Sometimes it is more
convenient to start sequences from the index $n = 0$, i.e., to consider a
sequence as a function on $\mathbf{N} \cup \{0\}$. For example, the sequence $(1, 2, 4, 8, \ldots)$
can be written $a_n = 2^n$, $n \ge 0$. Specific examples of sequences are usually
constructed by induction as in Exercise 1.3.10. However, we will not repeat
the construction carried out there for each sequence we encounter.

In this section, we are interested in the behavior of sequences as the index 
n increases without bound. Often this is referred to as the “limiting behavior” 
of sequences. For example, consider the sequences 


\[
\begin{aligned}
(a_n) &= (1/2,\ 2/3,\ 3/4,\ 4/5,\ \ldots),\\
(b_n) &= (1,\ -1,\ 1,\ -1,\ \ldots),\\
(c_n) &= \left(\sqrt{2},\ \sqrt{\sqrt{2}},\ \sqrt{\sqrt{\sqrt{2}}},\ \ldots\right),\\
(d_n) &= (2,\ 3/2,\ 17/12,\ 577/408,\ \ldots),
\end{aligned}
\]

where, in the last⁷ sequence, $d_1 = 2$, $d_2 = (d_1 + 2/d_1)/2$, $d_3 = (d_2 + 2/d_2)/2$,
$d_4 = (d_3 + 2/d_3)/2$, and so on. What are the limiting behaviors of these
sequences?
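
The terms of $(d_n)$ listed above can be reproduced exactly with rational arithmetic. The following Python sketch is an illustration only; the function name `d_terms` is hypothetical.

```python
from fractions import Fraction

def d_terms(count):
    """First `count` terms of d_1 = 2, d_{n+1} = (d_n + 2/d_n)/2, as exact rationals."""
    d = Fraction(2)
    terms = []
    for _ in range(count):
        terms.append(d)
        d = (d + 2 / d) / 2
    return terms

terms = d_terms(4)
print([str(t) for t in terms])      # ['2', '3/2', '17/12', '577/408']
print([t * t - 2 for t in terms])   # exact gaps d_n^2 - 2
```

The second line prints the exact differences $d_n^2 - 2$, which shrink rapidly; this is consistent with the remark that the sequence approaches something quickly, although a computation alone does not establish that a limit exists.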

As $n$ increases, the terms in $(a_n)$ are arranged in increasing order, and
$a_n < 1$ for all $n \ge 1$. However, if we increase $n$ sufficiently, the terms $a_n =
(n-1)/n = 1 - 1/n$ become arbitrarily close to 1, since $\sup\{1 - 1/n : n \ge
1\} = 1$ (§1.4). Thus, it seems reasonable to say that $(a_n)$ approaches one or
the limit of the sequence $(a_n)$ equals one.

On the other hand, the sequence $(b_n)$ does not seem to approach any single
real, as it flips back and forth between 1 and $-1$. Indeed, one is tempted to
say that $(b_n)$ has two limits, 1 and $-1$.

The third sequence is more subtle. Since we have $\sqrt{x} < x$ for $x > 1$, the
terms are arranged in decreasing order. Because of this, it seems reasonable
that $(c_n)$ approaches its “bottom,” i.e., $(c_n)$ approaches $L = \inf\{c_n : n \ge 1\}$.
Although, in fact, this turns out to be so, it is not immediately clear just
what $L$ equals.

The limiting behavior of the fourth sequence is not at all clear. If one
computes the first nine terms, it is clear that this sequence approaches something
quickly. However, since such a computation is approximate, at the outset, we
cannot be sure there is a single real number that qualifies as “the limit” of
$(d_n)$. The sequence $(d_n)$ is discussed in Exercise 1.5.12 and in Exercise 1.6.4.

It is important to realize that the questions 


A. What does “limit” mean? 
B. Does the limit exist? 
C. How do we compute the limit? 


are very different. When the situation is sufficiently simple, say, as in $(a_n)$ or
$(b_n)$ above, we may feel that the notion of “limit” is self-evident and needs no
elaboration. Then we may choose to deal with more complicated situations 
on a case-by-case basis and not worry about a “general” definition of limit. 
Historically, however, mathematicians have run into trouble using this ad hoc 
approach. Because of this, a more systematic approach was adopted in which 
a single definition of “limit” is applied. This approach was so successful that 
it is universally followed today. 

Below, we define the concept of limit in two stages: first, for mono- 
tone sequences and then for general sequences. To deal with situations 
where sequences approach more than one real, the auxiliary concept of a 
“limit point” is introduced in Exercise 1.5.9. Now we turn to the formal 
development. 

Let $(a_n)$ be any sequence. We say $(a_n)$ is decreasing if $a_n \ge a_{n+1}$ for all
natural $n$. If $L = \inf\{a_n : n \ge 1\}$, in this case, we say $(a_n)$ approaches $L$ as
$n \nearrow \infty$, and we write $a_n \searrow L$ as $n \nearrow \infty$. Usually, we drop the phrase “as
$n \nearrow \infty$” and simply write $a_n \searrow L$. We say a sequence $(a_n)$ is increasing if
$a_n \le a_{n+1}$ for all $n \ge 1$. If $L = \sup\{a_n : n \ge 1\}$, in this case, we say $(a_n)$
approaches $L$ as $n \nearrow \infty$, and we write $a_n \nearrow L$ as $n \nearrow \infty$. Usually, we drop
the phrase “as $n \nearrow \infty$” and simply write $a_n \nearrow L$. Alternatively, in either
case, we say the limit of $(a_n)$ is $L$, and we write


7 Decimal notation, e.g., 17 = (9 + 1) + 7, is reviewed in the next section.


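\[
\lim_{n \nearrow \infty} a_n = L.
\]

Note that since sups and infs are uniquely determined, we say “the limit”
instead of “a limit.” Thus,
\[
\lim_{n \nearrow \infty} \left(1 - \frac{1}{n}\right) = 1
\]
since $\sup\{1 - 1/n : n \ge 1\} = 1$,
\[
\lim_{n \nearrow \infty} \frac{1}{n} = 0
\]
since $\inf\{1/n : n \ge 1\} = 0$, and
\[
\lim_{n \nearrow \infty} n^2 = \infty
\]
since $\sup\{n^2 : n \ge 1\} = \infty$.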

We say a sequence is monotone if the sequence is either increasing or
decreasing. Thus, the concept of limit is now defined for every monotone
sequence. We say a sequence is constant if it is both decreasing and increasing,
i.e., it is of the form $(a, a, \ldots)$ where $a$ is a fixed real.

If $(a_n)$ is a monotone sequence approaching a nonzero limit $a$, then there
is a natural $N$ beyond which $a_n \ne 0$ for $n \ge N$. To see this, suppose that $(a_n)$
is increasing and $a > 0$. Then by definition $a = \sup\{a_n : n \ge 1\}$, hence $a/2$ is
not an upper bound for $(a_n)$. Thus, there is a natural $N$ with $a_N > a/2 > 0$.
Since the sequence is increasing, we conclude that $a_n \ge a_N > 0$ for $n \ge N$.
If $(a_n)$ is increasing and $a < 0$, then $a_n \le a < 0$ for all $n \ge 1$. If the sequence
is decreasing, the reasoning is similar.

Before we define limits for arbitrary sequences, we show that every
sequence $(a_n)$ lies between a decreasing sequence $(a_n^*)$ and an increasing
sequence $(a_{n*})$ in a simple and systematic fashion.

Let $(a_n)$ be any sequence. Let $a_1^* = \sup\{a_k : k \ge 1\}$, $a_2^* = \sup\{a_k : k \ge 2\}$,
and for each natural $n$, let $a_n^* = \sup\{a_k : k \ge n\}$. Thus, $a_n^*$ is the sup of all
the terms starting from the nth term. Since $\{a_k : k \ge n+1\} \subset \{a_k : k \ge n\}$
and the sup is monotone (§1.2), $a_{n+1}^* \le a_n^*$. Moreover, it is clear from the
definition that

\[
a_n^* = \max(a_n, a_{n+1}^*) \ge a_{n+1}^*, \qquad n \ge 1,
\]

holds for every $n \ge 1$. Thus, $(a_n^*)$ is decreasing and $a_n \le a_n^*$ since $a_n \in \{a_k :
k \ge n\}$. Similarly, we set $a_{n*} = \inf\{a_k : k \ge n\}$ for each $n \ge 1$. Then $(a_{n*})$ is
increasing and $a_n \ge a_{n*}$. $(a_n^*)$ is the upper sequence, and $(a_{n*})$ is the lower
sequence of the sequence $(a_n)$ (Figure 1.2).

Let us look at the sequence $(a_n^*)$ more closely and consider the following
question: When might the sup be attained in the definition of $a_n^*$? To be
specific, suppose the sup is attained in $a_9^*$, i.e., suppose

\[
a_9^* = \sup\{a_n : n \ge 9\} = \max\{a_n : n \ge 9\}.
\]

Fig. 1.2 Upper and lower sequences with $x_n = x_6$, $n \ge 6$


This means the set $\{a_n : n \ge 9\}$ has a greatest element. Then since $a_8^* =
\max(a_8, a_9^*)$, it follows that the set $\{a_n : n \ge 8\}$ has a greatest element or
that the sup is attained in $a_8^*$. Continuing in this way, it follows that all the
suprema in $a_n^*$, for $1 \le n \le 9$, are attained. We conclude that if the sup is
attained in $a_n^*$ for some particular $n$, then the sups are attained in $a_m^*$ for all
$m < n$. Equivalently, if the sup is not attained in $a_n^*$ for a particular $n$, then
the suprema are not attained for all subsequent terms $a_m^*$, $m > n$.

Now suppose $a_n^* > a_{n+1}^*$ for a particular $n$, say $a_8^* > a_9^*$. Since $a_8^* =
\max(a_8, a_9^*)$, this implies $a_8^* = a_8$, which implies the sup is attained in $a_8^*$.
Equivalently, if the sup is not attained in $a_8^*$, then neither is it attained in
$a_9^*$, $a_{10}^*$, $\ldots$, and moreover we have $a_8^* = a_9^* = a_{10}^* = \cdots$.

Summarizing, we conclude: For any sequence $(a_n)$, there is an $1 \le N \le \infty$
such that the terms $a_n^*$, $1 \le n < N$, are maxima, $a_n^* = \max\{a_k : k \ge n\}$,
rather than suprema, and the sequence $(a_N^*, a_{N+1}^*, \ldots)$ is constant. When
$N = 1$, the whole sequence $(a_n^*)$ is constant, and when $N = \infty$, all terms in
the sequence $(a_n^*)$ are maxima.

Let us now return to the main development. 

If the sequence $(a_n)$ is any sequence, then the sequences $(a_n^*)$, $(a_{n*})$ are
monotone; hence, they have limits,

\[
a_n^* \searrow a^*, \qquad a_{n*} \nearrow a_*.
\]

In fact, $a_* \le a^*$. To see this, fix a natural $N \ge 1$. Then

\[
a_{N*} \le a_{n*} \le a_n \le a_n^* \le a_N^*, \qquad n \ge N.
\]

But since $(a_{n*})$ is increasing, $a_{1*}, a_{2*}, \ldots, a_{N*}$ are all $\le a_{N*}$, hence

\[
a_{n*} \le a_N^*, \qquad n \ge 1.
\]

Hence, $a_N^*$ is an upper bound for the set $\{a_{n*} : n \ge 1\}$. Since $a_*$ is the sup of
this set, we must have $a_* \le a_N^*$. But this is true for every natural $N$. Since
$a^*$ is the inf of the set $\{a_N^* : N \ge 1\}$, we conclude that $a_* \le a^*$.


Theorem 1.5.1. Let $(a_n)$ be a sequence, and let

\[
a_n^* = \sup\{a_k : k \ge n\}, \qquad a_{n*} = \inf\{a_k : k \ge n\}, \qquad n \ge 1.
\]

Then $(a_n^*)$ and $(a_{n*})$ are decreasing and increasing, respectively. Moreover, if
$a^*$ and $a_*$ are their limits, then


A. $a_{n*} \le a_n \le a_n^*$ for all $n \ge 1$,
B. $a_n^* \searrow a^*$,
C. $a_{n*} \nearrow a_*$, and
D. $-\infty \le a_* \le a^* \le \infty$.


A sequence $(a_n)$ is bounded if $\{a_k : k \ge 1\}$ is a bounded subset of $\mathbf{R}$.
Otherwise, $(a_n)$ is unbounded. We caution the reader that some of the terms
$a_n^*$, $a_{n*}$, as well as the limits $a^*$, $a_*$, may equal $\pm\infty$, when $(a_n)$ is unbounded.
Keeping this possibility in mind, the theorem is correct as it stands.

If the sequence $(a_n)$ happens to be increasing, then $a_n^* = a^*$ and $a_{n*} = a_n$
for all $n \ge 1$. If $(a_n)$ happens to be decreasing, then $a_n^* = a_n$ and $a_{n*} = a_*$
for all $n \ge 1$.

If $N$ is a fixed natural and $(a_n)$ is a sequence, let $(a_{N+n})$ be the sequence
$(a_{N+1}, a_{N+2}, \ldots)$. Then $a_n \nearrow a_*$ iff $a_{N+n} \nearrow a_*$, and $a_n \searrow a^*$ iff $a_{N+n} \searrow a^*$.
Also by the sup reflection property (§1.2), $b_n = -a_n$ for all $n \ge 1$ implies
$b_n^* = -a_{n*}$, $b_{n*} = -a_n^*$ for all $n \ge 1$. Hence, $b^* = -a_*$, $b_* = -a^*$.

Now we define the limit of an arbitrary sequence. Let $(a_n)$ be any sequence,
and let $(a_n^*)$, $(a_{n*})$, $a^*$, $a_*$ be the upper and lower sequences together with
their limits. We call $a^*$ the upper limit of the sequence $(a_n)$ and $a_*$ the lower
limit of the sequence $(a_n)$. If they are equal, $a^* = a_*$, we say that $L = a^* = a_*$
is the limit of $(a_n)$, and we write

\[
\lim_{n \nearrow \infty} a_n = L.
\]

Alternatively, we say $a_n$ approaches $L$ or $a_n$ converges to $L$, and we write
$a_n \to L$ as $n \nearrow \infty$ or just $a_n \to L$. If they are not equal, $a^* \ne a_*$, we say
that $(a_n)$ does not have a limit.

Since the upper limit is the limit of suprema, and the lower limit is the
limit of infima, they are also called the limsup and liminf of the sequence,
and we write

\[
a^* = \limsup_{n \nearrow \infty} a_n, \qquad a_* = \liminf_{n \nearrow \infty} a_n.
\]
If $(a_n)$ is monotone, let $L$ be its limit as a monotone sequence. Then
its upper and lower sequences are equal to itself and the constant sequence
$(L, L, \ldots)$. Thus, its upper limit is $L$, and its lower limit is $L$. Hence $L$ is its
limit according to the second definition. In other words, the two definitions
are consistent.

Clearly a constant sequence $(a, a, a, \ldots)$ approaches $a$ in any of the above
senses, as $a_n^* = a$ and $a_{n*} = a$ for all $n \ge 1$.

Let us look at an example. Take $a_n = (-1)^n/n$, $n \ge 1$, or

\[
(a_n) = \left(-1,\ \frac{1}{2},\ -\frac{1}{3},\ \frac{1}{4},\ \ldots\right).
\]

Then

\[
(a_n^*) = \left(\frac{1}{2},\ \frac{1}{2},\ \frac{1}{4},\ \frac{1}{4},\ \ldots\right),
\qquad
(a_{n*}) = \left(-1,\ -\frac{1}{3},\ -\frac{1}{3},\ -\frac{1}{5},\ -\frac{1}{5},\ \ldots\right).
\]

Hence $a^* = a_* = 0$; thus, $a_n \to 0$.
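
The upper and lower sequences of this example can be explored numerically. The Python sketch below works on a finite window of 40 terms, which is an assumption of the illustration: suffix maxima and minima of a window agree with the true sups and infs only when they are attained inside the window, as they are here for all but the last entry. The helper name `upper_lower` is hypothetical.

```python
from fractions import Fraction

def upper_lower(terms):
    """Suffix maxima and minima of a finite list (window versions of a_n^*, a_{n*})."""
    upper, lower = [], []
    hi, lo = terms[-1], terms[-1]
    for t in reversed(terms):          # scan from the last term backwards
        hi, lo = max(hi, t), min(lo, t)
        upper.append(hi)
        lower.append(lo)
    return upper[::-1], lower[::-1]

# Window of the sequence a_n = (-1)^n / n for n = 1, ..., 40.
window = [Fraction((-1) ** n, n) for n in range(1, 41)]
upper, lower = upper_lower(window)
print([str(u) for u in upper[:4]])  # ['1/2', '1/2', '1/4', '1/4']
print([str(l) for l in lower[:4]])  # ['-1', '-1/3', '-1/3', '-1/5']
```

The backward scan uses the identity $a_n^* = \max(a_n, a_{n+1}^*)$ from the text, and its analogue for the lower sequence.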

Not every sequence has a limit. For example, $(1, 0, 1, 0, 1, 0, \ldots)$ does not
have a limit. Indeed, here, $a_n^* = 1$ and $a_{n*} = 0$ for all $n \ge 1$, hence $a_* = 0 <
1 = a^*$.

Limits of sequences satisfy simple properties. For example, $a_n \to a$ implies
$-a_n \to -a$, and $a_n \to L$ iff $a_{N+n} \to L$. Thus, in a very real sense, the limiting
behavior of a sequence does not depend on the first $N$ terms of the sequence,
for any $N \ge 1$. Here is the ordering property for sequences.


Theorem 1.5.2. Suppose that $(a_n)$, $(b_n)$, and $(c_n)$ are sequences with $a_n \le
b_n \le c_n$ for all $n \ge 1$. If $b_n \to K$ and $c_n \to L$, then $K \le L$. If $a_n \to L$ and
$c_n \to L$, then $b_n \to L$.


Note that, in the second assertion, the existence of the limit of $(b_n)$ is
not assumed, but is rather part of the conclusion. Why is this theorem true?
Well, $c_1^*$ is an upper bound for the set $\{c_k : k \ge 1\}$. Since $b_k \le c_k$ for all $k$,
$c_1^*$ is an upper bound for $\{b_k : k \ge 1\}$. Since $b_1^*$ is the least such, $b_1^* \le c_1^*$.
Repeating this argument with $k$ starting at $n$, instead of at 1, yields $b_n^* \le c_n^*$
for all $n \ge 1$. Repeating the same reasoning again yields $b^* \le c^*$. If $b_n \to K$
and $c_n \to L$, then $b^* = K$ and $c^* = L$, so $K \le L$, establishing the first
assertion. To establish the second, we know that $b^* \le c^*$. Now set $C_n = -a_n$
and $B_n = -b_n$ for all $n \ge 1$. Then $B_n \le C_n$ for all $n \ge 1$, so by what we just
learned, $B^* \le C^*$. But $B^* = -b_*$ and $C^* = -a_*$, so $a_* \le b_*$. We conclude
that $a_* \le b_* \le b^* \le c^*$. If $a_n \to L$ and $c_n \to L$, then $a_* = L$ and $c^* = L$,
hence $b_* = b^* = L$.

As an application, $2^{-n} \to 0$ as $n \nearrow \infty$ since $0 < 2^{-n} < 1/n$ for all $n \ge 1$.
Similarly, $\lim_{n \nearrow \infty} \left(\frac1n - \frac1{n^2}\right) = 0$ since

\[ -\frac1n \le \frac1n - \frac1{n^2} \le \frac1n \]

for all $n \ge 1$ and $\pm 1/n \to 0$ as $n \nearrow \infty$.

Let $(a_n)$ be a sequence with nonnegative terms. Often the ordering property
is used to show that $a_n \to 0$ by finding a sequence $(e_n)$ satisfying
$0 \le a_n \le e_n$ for all $n \ge 1$ and $e_n \to 0$.

Below and throughout the text, we will use the following easily checked
fact: If $a$ and $b$ are reals and $a \le b + \epsilon$ for all real $\epsilon > 0$, then $a \le b$.
Indeed, either $a \le b$ or $a > b$. If the latter case occurs, we may choose
$\epsilon = (a - b)/2 > 0$, yielding the contradiction $a = b + (a - b) > b + \epsilon$. Thus,
the former case must occur, or $a \le b$. Moreover, if $a$ and $b$ are reals and
$b \le a \le b + \epsilon$ for all $\epsilon > 0$, then $a = b$.

Throughout the text, $\epsilon$ will denote a positive real number.


Theorem 1.5.3. If $a_n \to a$ and $b_n \to b$ with $a$, $b$ real, then $\max(a_n, b_n) \to
\max(a, b)$ and $\min(a_n, b_n) \to \min(a, b)$. Moreover, for any sequence $(a_n)$ and
$L$ real, $a_n \to L$ iff $a_n - L \to 0$ iff $|a_n - L| \to 0$.


Let $c_n = \max(a_n, b_n)$, $n \ge 1$, $c = \max(a, b)$, and let us assume, first, that
the sequences $(a_n)$, $(b_n)$ are decreasing. Then their limits are their infs, and
$c_n = \max(a_n, b_n) \ge \max(a, b) = c$. Hence setting $c_* = \inf\{c_n : n \ge 1\}$,
we conclude that $c_* \ge c$. On the other hand, given $\epsilon > 0$, there are $n$ and
$m$ satisfying $a_n < a + \epsilon$ and $b_m < b + \epsilon$, so $c_{n+m} = \max(a_{n+m}, b_{n+m}) \le
\max(a_n, b_m) < \max(a + \epsilon, b + \epsilon) = c + \epsilon$. Thus, $c_* < c + \epsilon$. Since $\epsilon > 0$ is
arbitrary and we already know $c_* \ge c$, we conclude that $c_* = c$. Since $(c_n)$ is
decreasing, we have shown that $c_n \to c$.

Now assume $(a_n)$, $(b_n)$ are increasing. Then their limits are their sups,
and $c_n = \max(a_n, b_n) \le \max(a, b) = c$. Hence setting $c^* = \sup\{c_n : n \ge 1\}$,
we conclude that $c^* \le c$. On the other hand, given $\epsilon > 0$, there are $n$ and
$m$ satisfying $a_n > a - \epsilon$ and $b_m > b - \epsilon$, so $c_{n+m} = \max(a_{n+m}, b_{n+m}) \ge
\max(a_n, b_m) > \max(a - \epsilon, b - \epsilon) = c - \epsilon$. Thus, $c^* > c - \epsilon$. Since $\epsilon > 0$ is
arbitrary and we already know $c^* \le c$, we conclude that $c^* = c$. Since $(c_n)$ is
increasing, we have shown that $c_n \to c$.

Now for a general sequence $(a_n)$, we have $(a_n^*)$ decreasing, $(a_{n*})$ increasing,
and

\[ \max(a_{n*}, b_{n*}) \le c_n \le \max(a_n^*, b_n^*), \qquad n \ge 1. \]

Thus, $(c_n)$ lies between two sequences converging to $c = \max(a, b)$. By the
ordering property, we conclude that $c_n \to \max(a, b)$.

Since $\min(a, b) = -\max(-a, -b)$, the second assertion follows from the
first.

For the third assertion, assume, first, $a_n \to L$, and set $b_n = a_n - L$.
Since $\sup(A - a) = \sup A - a$ and $\inf(A - a) = \inf A - a$, $b_n^* = a_n^* - L$, and
$b_{n*} = a_{n*} - L$. Hence $b^* = a^* - L = 0$, and $b_* = a_* - L = 0$. Thus, $a_n - L \to 0$.
If $a_n - L \to 0$, then $L - a_n \to 0$. Hence $|a_n - L| = \max(a_n - L, L - a_n) \to 0$
by the first assertion. Conversely, since

\[ -|a_n - L| \le a_n - L \le |a_n - L|, \qquad n \ge 1, \]

$|a_n - L| \to 0$ implies $a_n - L \to 0$, by the ordering property. Since $a_n =
(a_n - L) + L$, this implies $a_n \to L$.

Often this theorem will be used to show that $a_n \to L$ by finding a sequence
$(e_n)$ satisfying $|a_n - L| \le e_n$ and $e_n \to 0$. For example, let $A \subset \mathbf{R}$ be
bounded above. Then $\sup A - 1/n$ is not an upper bound for $A$; hence, for
each $n \ge 1$, there is a real $x_n \in A$ satisfying $\sup A - 1/n < x_n \le \sup A$, hence
$|x_n - \sup A| < 1/n$. By the above, we conclude that $x_n \to \sup A$. When $A$ is
not bounded above, for each $n \ge 1$, there is a real $x_n \in A$ satisfying $x_n > n$.
Then $x_n \to \infty = \sup A$. In either case, we conclude, if $A \subset \mathbf{R}$, there is a
sequence $(x_n) \subset A$ with $x_n \to \sup A$. Similarly, if $A \subset \mathbf{R}$, there is a sequence
$(x_n) \subset A$ with $x_n \to \inf A$. We have shown


Theorem 1.5.4. If $A$ is a subset of $\mathbf{R}$, bounded or unbounded, there is a
sequence $(x_n)$ in $A$ converging to $\sup A$, $x_n \to \sup A$, and there is a sequence
$(x_n)$ in $A$ converging to $\inf A$, $x_n \to \inf A$.


Now we derive the arithmetic properties of limits. 


Theorem 1.5.5. If $a_n \to a$ and $c$ is real, then $c a_n \to ca$. Let $a$, $b$ be real. If
$a_n \to a$, $b_n \to b$, then $a_n + b_n \to a + b$ and $a_n b_n \to ab$. Moreover, if $b \ne 0$,
then $b_n \ne 0$ for $n$ sufficiently large and $a_n/b_n \to a/b$.


If $c = 0$, there is nothing to show. If $c > 0$, set $b_n = c a_n$. Since $\sup(cA) =
c \sup A$ and $\inf(cA) = c \inf A$, $b_n^* = c a_n^*$, $b_{n*} = c a_{n*}$, $b^* = c a^*$, $b_* = c a_*$.
Hence $a_n \to a$ implies $c a_n \to ca$. Since $(-c) a_n = -(c a_n)$, the case with $c$
negative follows.

To derive the additive property, assume, first, that $a = b = 0$. We have to
show that $a_n + b_n \to 0$. Then

\[ 2\min(a_n, b_n) \le a_n + b_n \le 2\max(a_n, b_n), \qquad n \ge 1. \]

Thus, $a_n + b_n$ lies between two sequences approaching 0, so $a_n + b_n \to 0$. For
general $a$, $b$, apply the previous to $a_n' = a_n - a$, $b_n' = b_n - b$.

To derive the multiplicative property, first, note that $a_{1*} \le a_n \le a_1^*$, so
$|a_n| \le k$ for some $k$, i.e., $(a_n)$ is bounded. Use the triangle inequality to get

\[ |a_n b_n - ab| = |(a_n - a)b + (b_n - b)a_n| \le |b|\,|a_n - a| + |a_n|\,|b_n - b|
   \le |b|\,|a_n - a| + k\,|b_n - b|, \qquad n \ge 1. \]

Now the result follows from the additive and ordering properties.

To obtain the division property, assume $b > 0$. From the above, $a_n b - a b_n \to
0$. Since $b_n \to b$, $b_{n*} \nearrow b$, so there exists $N \ge 1$ beyond which $b_n \ge b_{N*} > 0$
for $n \ge N$. Thus,

\[ 0 \le \left| \frac{a_n}{b_n} - \frac{a}{b} \right| = \frac{|a_n b - a b_n|}{|b_n|\,|b|}
   \le \frac{|a_n b - a b_n|}{b_{N*}\, b}, \qquad n \ge N. \]

Thus, $|a_n/b_n - a/b|$ lies between zero and a sequence approaching zero. The
case $b < 0$ is entirely similar.

In fact, although we do not derive this, this theorem remains true when
$a$ or $b$ is infinite, as long as we do not allow undefined expressions, such as
$\infty - \infty$ (the allowable expressions are defined in §1.2).

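As an application of this theorem,

\[ \lim_{n \nearrow \infty} \frac{2n^2 + 1}{n^2 - 2n + 1}
   = \lim_{n \nearrow \infty} \frac{2 + \dfrac{1}{n^2}}{1 - \dfrac{2}{n} + \dfrac{1}{n^2}}
   = \frac{2 + 0}{1 - 2 \cdot 0 + 0} = 2. \]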


If $(a_n)$ is a sequence with positive terms and $b_n = 1/a_n$, then $a_n \to 0$
iff $b_n \to \infty$ (Exercise 1.5.3). Now let $a > 1$ and set $b = a - 1$. Then $a^n =
(1 + b)^n \ge 1 + nb$ for all $n \ge 1$ (Exercise 1.3.22). Hence $a^n \nearrow \infty$. If $0 < a < 1$,
then $a = 1/b$ with $b > 1$, so $a^n = 1/b^n \searrow 0$. Summarizing,


A. If $a > 1$, then $a^n \nearrow \infty$,
B. If $a = 1$, then $a^n = 1$ for all $n \ge 1$, and
C. If $0 \le a < 1$, then $a^n \to 0$.


Sometimes we say that a sequence $(a_n)$ converges to $L$ if $a_n \to L$. If
the specific limit is not relevant, we say that the sequence converges or is
convergent. If a sequence has no limit, we say it diverges. More precisely, if
the sequence $(a_n)$ does not approach $L$, we say that it diverges from $L$, and
we write $a_n \not\to L$. From the definition of $a_n \to L$, we see that $a_n \not\to L$ means
either $a^* \ne L$ or $a_* \ne L$. This is so whether $L$ is real or $\pm\infty$.

Typically, divergence is oscillatory behavior, e.g., $a_n = (-1)^n$. Here the
sequence goes back and forth, never settling on anything, not even $\infty$ or
$-\infty$. Nevertheless, this sequence is bounded. Of course a sequence may be
oscillatory and unbounded, e.g., $a_n = (-1)^n n$.

Let $(a_n)$ be a sequence, and suppose that $1 \le k_1 < k_2 < k_3 < \ldots$
is an increasing sequence of distinct naturals. Set $b_n = a_{k_n}$, $n \ge 1$. Then
$(b_n) = (a_{k_n})$ is a subsequence of $(a_n)$. If $a_n \to L$, then $a_{k_n} \to L$. Conversely,
if $(a_n)$ is monotone, $a_{k_n} \to L$ implies $a_n \to L$ (Exercise 1.5.4).

Generally, if a sequence $(x_n)$ has a subsequence $(x_{k_n})$ converging to $L$, we
say that $(x_n)$ subconverges to $L$.

Let $(a_n)$ converge to a (finite) real limit $L$, and let $\epsilon > 0$ be given. Since
$(a_{n*})$ is increasing to $L$, there must exist a natural $N_*$ such that $a_{n*} > L - \epsilon$
for $n \ge N_*$. Similarly, there must exist $N^*$ beyond which we have $a_n^* < L + \epsilon$.
Since $a_{n*} \le a_n \le a_n^*$ for all $n \ge 1$, we obtain $L - \epsilon < a_n < L + \epsilon$ for
$n \ge N = \max(N^*, N_*)$. Thus, all but finitely many terms of the sequence lie
in $(L - \epsilon, L + \epsilon)$ (Figure 1.3).

Note that choosing a smaller $\epsilon > 0$ is a more stringent condition on the
terms. As such, it leads to (in general) a larger $N$, i.e., the number of terms
that fall outside the interval $(L - \epsilon, L + \epsilon)$ depends on the choice of $\epsilon > 0$.

Conversely, suppose that $L - \epsilon < a_n < L + \epsilon$ for all but finitely many
terms, for every $\epsilon > 0$. Then for a given $\epsilon > 0$, by the ordering property,
$L - \epsilon \le a_{n*} \le a_n^* \le L + \epsilon$ for all but finitely many terms. Hence
$L - \epsilon \le a_* \le a^* \le L + \epsilon$. Since this holds for every $\epsilon > 0$, we conclude that
$a_* = a^* = L$, i.e., $a_n \to L$. We have derived the following:


Theorem 1.5.6. Let $(a_n)$ be a sequence and let $L$ be real. If $a_n \to L$, then all
but finitely many terms of the sequence lie within the interval $(L - \epsilon, L + \epsilon)$,
for all $\epsilon > 0$. Conversely, if all but finitely many terms lie in the interval
$(L - \epsilon, L + \epsilon)$, for all $\epsilon > 0$, then $a_n \to L$.


Fig. 1.3 Convergence to L (number line marking $L - \epsilon$, $L$, $L + \epsilon$, with the terms $x_1, x_2, x_3, \ldots$ plotted along it)
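
To make the dependence of $N$ on $\epsilon$ concrete, here is a small sketch for the earlier example $a_n = (-1)^n/n$ with $L = 0$. The function names `threshold` and `outside` are ours, and `threshold` relies on the specific fact that $|a_n| = 1/n$ for this sequence; it is not a general-purpose routine.

```python
from fractions import Fraction
import math

def a(n):
    """Term a_n = (-1)^n / n, n >= 1."""
    return Fraction((-1) ** n, n)

def threshold(eps):
    """Smallest N with |a_n - 0| < eps for all n >= N, using |a_n| = 1/n."""
    return math.floor(1 / eps) + 1

def outside(eps, limit=0, upto=2000):
    """Indices n <= upto whose terms fall outside (limit - eps, limit + eps)."""
    return [n for n in range(1, upto + 1) if abs(a(n) - limit) >= eps]

for eps in (Fraction(1, 10), Fraction(1, 100)):
    print(eps, threshold(eps), len(outside(eps)))
```

Shrinking $\epsilon$ from $1/10$ to $1/100$ raises the number of excluded terms from 10 to 100, in line with the remark that a smaller $\epsilon$ leads in general to a larger $N$.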


From this, we conclude that if $a_n \to L$ and $L \ne 0$, then $a_n \ne 0$ for all but
finitely many $n$.
Here is a useful application of the above. 


Theorem 1.5.7. Suppose that $f : \mathbf{N} \to \mathbf{N}$ is injective, i.e., suppose that
$(f(n)) = (a_n)$ is a sequence consisting of distinct naturals. Then $f(n) \to \infty$.


To see this, let $k_n = \max\{k : f(k) < n\}$ (Exercise 1.3.19). Then $k > k_n$
implies $f(k) \ge n$, which implies $f_*(k_n + 1) = \inf\{f(k) : k > k_n\} \ge n$. Hence
$f_*(k_n + 1) \nearrow \infty$, as $n \nearrow \infty$. Since $(f_*(n))$ is monotone and $(f_*(k_n + 1))$
is a subsequence of $(f_*(n))$, it follows that $f_*(n) \nearrow \infty$, as $n \nearrow \infty$
(Exercise 1.5.4). Since $f(n) \ge f_*(n)$, we conclude that $f(n) \to \infty$.


Exercises 


1.5.1. Fix $N \ge 1$ and $(a_n)$. Let $(a_{N+n})$ be the sequence $(a_{N+1}, a_{N+2}, \ldots)$.
Then $a_n \nearrow L$ iff $a_{N+n} \nearrow L$, and $a_n \searrow L$ iff $a_{N+n} \searrow L$. Conclude that
$a_n \to L$ iff $a_{N+n} \to L$.


1.5.2. If $a_n \to L$, then $-a_n \to -L$.


1.5.3. If $A \subset \mathbf{R}^+$ is nonempty and $1/A = \{1/x : x \in A\}$, then $\inf(1/A) =
1/\sup A$, where $1/\infty$ is interpreted here as 0. If $(a_n)$ is a sequence with
positive terms and $b_n = 1/a_n$, then $a_n \to 0$ iff $b_n \to \infty$.


1.5.4. If $a_n \to L$ and $(a_{k_n})$ is a subsequence, then $a_{k_n} \to L$. If $(a_n)$ is
monotone and $a_{k_n} \to L$, then $a_n \to L$.


1.5.5. If $a_n \to L$ and $L \ne 0$, then $a_n \ne 0$ for all but finitely many $n$.




1.5.6. Let $a_n = \sqrt{n+1} - \sqrt{n}$, $n \ge 1$. Compute $(a_n^*)$, $(a_{n*})$, $a^*$, and $a_*$. Does
$(a_n)$ converge?


1.5.7. Let $(a_n)$ be any sequence with upper and lower limits $a^*$ and $a_*$.
Show that $(a_n)$ subconverges to $a^*$ and subconverges to $a_*$, i.e., there are
subsequences $(a_{k_n})$ and $(a_{j_n})$ satisfying $a_{k_n} \to a^*$ and $a_{j_n} \to a_*$.


1.5.8. Suppose that $(a_n)$ diverges from $L \in \mathbf{R}$. Show that there is an $\epsilon > 0$
and a subsequence $(a_{k_n})$ satisfying $|a_{k_n} - L| \ge \epsilon$ for all $n \ge 1$.


1.5.9. Let $(x_n)$ be a sequence. If $(x_n)$ subconverges to $L$, we say that $L$ is a
limit point of $(x_n)$. Show that $x_*$ and $x^*$ are the least and the greatest limit
points.


1.5.10. Show that a sequence $(x_n)$ converges iff $(x_n)$ has exactly one limit
point.


1.5.11. Given $f : (a, b) \to \mathbf{R}$, let $M = \sup\{f(x) : a < x < b\}$. Show that
there is a sequence $(x_n)$ with $f(x_n) \to M$. (Consider the cases $M < \infty$ and
$M = \infty$.)


1.5.12. Define $(d_n)$ by $d_1 = 2$ and

\[ d_{n+1} = \frac12 \left( d_n + \frac{2}{d_n} \right), \qquad n \ge 1, \]

and set $e_n = d_n - \sqrt2$, $n \ge 1$. By induction, show that $e_n > 0$ for $n \ge 1$. Also
show that

\[ e_{n+1} \le \frac{e_n^2}{2\sqrt2} \qquad \text{for all } n \ge 1. \]

(First, check that, for any real $x > 0$, one has $(x + 2/x)/2 \ge \sqrt2$.)
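
The iteration in Exercise 1.5.12 is easy to experiment with in exact rational arithmetic. The sketch below is an illustration only and not a solution to the exercise. The function name `babylonian` is ours, and the identity checked, $d_{n+1}^2 - 2 = (d_n^2 - 2)^2/(4 d_n^2)$, is our own rational reformulation (obtained by expanding the square), chosen because it avoids any numerical value of $\sqrt2$.

```python
from fractions import Fraction

def babylonian(steps):
    """Return [d_1, ..., d_steps] for d_1 = 2, d_{n+1} = (d_n + 2/d_n)/2."""
    d = Fraction(2)
    out = [d]
    for _ in range(steps - 1):
        d = (d + 2 / d) / 2
        out.append(d)
    return out

ds = babylonian(6)

# Exact rational identity (our reformulation, see lead-in):
# d_{n+1}^2 - 2 = (d_n^2 - 2)^2 / (4 d_n^2)
identity_holds = all(
    ds[i + 1] ** 2 - 2 == (ds[i] ** 2 - 2) ** 2 / (4 * ds[i] ** 2)
    for i in range(len(ds) - 1)
)

for d in ds:
    print(d, float(d * d - 2))  # d_n^2 - 2 stays positive and shrinks rapidly
```

The squared quantity on the right is what makes the error roughly square at each step.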


1.5.13. Let $0 < x < 1$ be irrational, and let $(a_n)$ be as in Exercise 1.3.18.
Let

\[ x_n = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots \; a_{n-1} + \cfrac{1}{a_n}}}}. \]

Let $x'$ and $x_n'$ be the unique reals satisfying $x = 1/(a_1 + x')$ and $x_n = 1/(a_1 +
x_n')$, respectively. Then $0 < x_n, x', x_n' < 1$ and $x'$ and $x_n'$ are obtained by
“peeling off the top layer.” Similarly, let $x^{(2)} = x'' = (x')'$, $x^{(3)} = (x^{(2)})'$, $\ldots$,
$x_n^{(2)} = x_n'' = (x_n')'$, $x_n^{(3)} = (x_n^{(2)})'$, $\ldots$. Then $x_n^{(n-1)} = 1/a_n$, $n \ge 1$.


A. Show that $|x - x_n| = x\, x_n\, |x' - x_n'|$, $n \ge 2$.
B. Iterate A to show that $|x - x_n| \le 1/a_n$, $n \ge 1$.
C. Show that $x \le (a_2 + 1)/(a_2 + 2)$, $x' \le (a_3 + 1)/(a_3 + 2)$, etc.
D. If $N$ of the $a_k$'s are bounded by $c$, iterate A and use C to obtain

\[ |x - x_n| \le \left( \frac{c + 1}{c + 2} \right)^{N}, \]

for all but finitely many $n$.


Conclude that $|x - x_n| \to 0$ as $n \nearrow \infty$ (either $a_n \to \infty$ or $a_n \not\to \infty$).

Let $(a_n)$ be a sequence of reals. The series formed from the sequence $(a_n)$
is the sequence $(s_n)$ with terms $s_1 = a_1$, $s_2 = a_1 + a_2$, and for any $n \ge 1$,
$s_n = a_1 + a_2 + \cdots + a_n$ (this sum was defined in §1.3). The sequence $(s_n)$
is the sequence of partial sums. The terms $a_n$ are called summands, and the
series is nonnegative if $a_n \ge 0$ for all $n \ge 1$. We often use sigma notation,
and write $s_n = \sum_{k=1}^{n} a_k$. Series are often written

\[ a_1 + a_2 + \ldots. \]

In sigma notation, $\sum a_n$ or $\sum_{n=1}^{\infty} a_n$. If the sequence of partial sums $(s_n)$ has
a limit $L$, then we say the series sums or converges to $L$, and we write

\[ L = a_1 + a_2 + \cdots = \sum_{n=1}^{\infty} a_n. \]

Then $L$ is the sum of the series. By convention, we do not allow $\pm\infty$ as
limits for series, only reals. Nevertheless, for nonnegative series, we write
$\sum a_n = \infty$ to mean $\sum a_n$ diverges and $\sum a_n < \infty$ to mean $\sum a_n$ converges.
As with sequences, sometimes it is more convenient to start a series from
$n = 0$. In this case, we write $\sum_{n=0}^{\infty} a_n$.

Let $L = \sum_{n=1}^{\infty} a_n$ be a convergent series and let $s_n$ denote its $n$th partial
sum. The $n$th tail of the series is $L - s_n = \sum_{k=n+1}^{\infty} a_k$. Since the $n$th tail is
the difference between the $n$th partial sum and the sum, we see that the $n$th
tail of a convergent series goes to zero:

\[ \lim_{n \nearrow \infty} \sum_{k=n+1}^{\infty} a_k = 0. \tag{1.6.1} \]


Let $a$ be real. Our first series is the geometric series

\[ 1 + a + a^2 + \cdots = \sum_{n=0}^{\infty} a^n. \]

Here the $n$th partial sum $s_n = 1 + a + \cdots + a^n$ is computed as follows:

\[ a s_n = a(1 + a + \cdots + a^n) = a + a^2 + \cdots + a^{n+1} = s_n + a^{n+1} - 1. \]

Hence

\[ s_n = \frac{1 - a^{n+1}}{1 - a}, \qquad a \ne 1. \]

If $a = 1$, then $s_n = n + 1$, so $s_n \nearrow \infty$. If $|a| < 1$, $a^n \to 0$, so $s_n \to 1/(1 - a)$.
If $a > 1$, then $a^n \nearrow \infty$, so the series equals $\infty$ and hence diverges. If $a <
-1$, then $(a^n)$ diverges, so the series diverges. If $a = -1$, $s_n$ equals 0 or 1
(depending on $n$), hence diverges; hence, the series diverges. We have shown

\[ \sum_{n=0}^{\infty} a^n = \frac{1}{1 - a}, \qquad \text{if } |a| < 1, \]

and $\sum_{n=0}^{\infty} a^n$ diverges if $|a| \ge 1$.
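
As a quick sanity check of the closed form for the partial sums, the following sketch compares the direct sum $1 + a + \cdots + a^n$ with $(1 - a^{n+1})/(1 - a)$ in exact arithmetic. The names `partial_sum`, `closed_form`, and `ratio` are ours, and the value $1/3$ is an arbitrary choice with $|a| < 1$.

```python
from fractions import Fraction

def partial_sum(a, n):
    """s_n = 1 + a + ... + a^n, summed term by term."""
    return sum(a ** k for k in range(n + 1))

def closed_form(a, n):
    """(1 - a^(n+1)) / (1 - a), valid for a != 1."""
    return (1 - a ** (n + 1)) / (1 - a)

ratio = Fraction(1, 3)
limit = 1 / (1 - ratio)  # the sum 1/(1 - a) when |a| < 1

for n in range(6):
    s = partial_sum(ratio, n)
    print(n, s, limit - s)  # the gap equals a^(n+1)/(1 - a)
```

The printed gap is the tail of the series, which shrinks by the factor $a$ at each step.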
To study more general series, we need their arithmetic and ordering prop- 
erties. 


Theorem 1.6.1. If $\sum a_n = L$ and $\sum b_n = M$, then $\sum (a_n + b_n) = L + M$. If
$\sum a_n = L$, $c \in \mathbf{R}$, and $b_n = c a_n$, then $\sum b_n = cL = c(\sum a_n)$. If $a_n \le b_n \le c_n$
and $\sum a_n = L = \sum c_n$, then $\sum b_n = L$.


To see the first property, if $s_n$, $t_n$, and $r_n$ denote the partial sums of $\sum a_n$,
$\sum b_n$, and $\sum c_n$, then $s_n + t_n$ equals the partial sum of $\sum (a_n + b_n)$. Hence the
result follows from the corresponding arithmetic property of sequences. For
the second property, note that $t_n = c s_n$. Hence the result follows from the
corresponding arithmetic property of sequences. The third property follows
from the ordering property of sequences, since $s_n \le t_n \le r_n$, $s_n \to L$, and
$r_n \to L$.

Now we describe the comparison test which we use below to obtain the 
decimal expansions of reals. 


Theorem 1.6.2 (Comparison Test). Let $\sum a_n$, $\sum b_n$ be nonnegative series
with $a_n \le b_n$ for all $n \ge 1$. If $\sum b_n < \infty$, then $\sum a_n < \infty$. If $\sum a_n = \infty$,
then $\sum b_n = \infty$.


Stated this way, the theorem follows from the ordering property for
sequences and looks too simple to be of any serious use. In fact, we use it
to express every real as a sequence of naturals.


Theorem 1.6.3. Let $b = 9 + 1$. If $d_1, d_2, \ldots$ is a sequence of digits (§1.3),
then

\[ \sum_{n=1}^{\infty} d_n b^{-n} \]

sums to a real $x$, $0 \le x \le 1$. Conversely, if $0 \le x \le 1$, there is a sequence of
digits $d_1, d_2, \ldots$, such that the series sums to $x$.

The first statement follows by comparison, since

\[ \sum_{n=1}^{\infty} d_n b^{-n} \le \sum_{n=1}^{\infty} 9 b^{-n}
   = \frac{9}{b} \sum_{n=0}^{\infty} b^{-n} = \frac{9}{b} \cdot \frac{1}{1 - (1/b)} = 1. \]


To establish the second statement, if $x = 1$, we simply take $d_n = 9$ for all
$n \ge 1$. If $0 \le x < 1$, let $d_1$ be the largest integer $\le xb$. Then $d_1 \ge 0$. This
way we obtain a digit $d_1$ (since $x < 1$) satisfying $d_1 \le xb < d_1 + 1$. Now set
$x_1 = xb - d_1$. Then $0 \le x_1 < 1$. Repeating the above process, we obtain a
digit $d_2$ satisfying $d_2 \le x_1 b < d_2 + 1$. Substituting yields $d_2 + b d_1 \le b^2 x <
d_2 + b d_1 + 1$ or $d_2 b^{-2} + d_1 b^{-1} \le x < d_2 b^{-2} + d_1 b^{-1} + b^{-2}$. Continuing in
this manner yields a sequence of digits $(d_n)$ satisfying

\[ \left( \sum_{k=1}^{n} d_k b^{-k} \right) \le x < \left( \sum_{k=1}^{n} d_k b^{-k} \right) + b^{-n}, \qquad n \ge 1. \]

Thus, $x$ lies between two sequences converging to the same limit.
The sequence $(d_n)$ is the decimal expansion of the real $0 \le x \le 1$. As
usual, we write

\[ x = .d_1 d_2 d_3 \ldots. \]
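
The digit-by-digit construction in the proof translates directly into a short loop. The sketch below follows the recipe "take the largest integer $\le xb$, subtract it, repeat" in exact rational arithmetic; the function name `digits` and its parameters are ours, and the case $x = 1$ (handled separately in the text by taking all digits equal to 9) is deliberately excluded.

```python
from fractions import Fraction
import math

def digits(x, count, b=10):
    """First `count` base-b digits of a rational 0 <= x < 1.

    Each step takes d = floor(x*b) and replaces x by x*b - d,
    exactly as in the construction of d_1, d_2, ... above.
    """
    out = []
    for _ in range(count):
        d = math.floor(x * b)
        out.append(d)
        x = x * b - d
    return out

print(digits(Fraction(1, 8), 5))
print(digits(Fraction(1, 3), 4))
print(digits(Fraction(5, 8), 4, b=2))  # same recipe with base 2
```

Using `Fraction` rather than floating point keeps every comparison exact, so the inequality $\sum_{k \le n} d_k b^{-k} \le x < \sum_{k \le n} d_k b^{-k} + b^{-n}$ can be checked literally.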


To extend the decimal notation to any nonnegative real, for each $x \ge 1$,
there is a smallest natural $N$, such that $b^{-N} x < 1$. As usual, if $b^{-N} x =
.d_1 d_2 \ldots$, we write

\[ x = d_1 d_2 \ldots d_N . d_{N+1} d_{N+2} \ldots, \]

the decimal point (.) moved $N$ places. For example, $1 = 1.00\ldots$ and $b =
10.00\ldots$. In fact, $x$ is a natural iff $x = d_1 d_2 \ldots d_N.00\ldots$. Thus, for naturals,
we drop the decimal point and the trailing zeros, e.g., $1 = 1$, $b = 10$.

As an illustration, $x = 10/8 = 1.25$ since $x/10 = 1/8 < 1$, and $y = 1/8 =
.125$ since $1 \le 10y < 2$ so $d_1 = 1$, $z = 10y - 1 = 10/8 - 1 = 2/8$ satisfies
$2 \le 10z < 3$ so $d_2 = 2$, $t = 10z - d_2 = 10 \cdot 2/8 - 2 = 4/8$ satisfies $5 = 10t < 6$
so $d_3 = 5$ and $d_4 = 0$.

Note that we have two decimal representations of 1, $1 = .99\cdots = 1.00\ldots$.
This is not an accident. In fact, two distinct sequences of digits yield the
same real in $[0, 1]$ under only very special circumstances (Exercise 1.6.2).

The natural $b$, the base of the expansion, can be replaced by any
natural $> 1$. Then the digits are $(0, 1, \ldots, b - 1)$, and we would obtain $b$-ary
expansions. In §4.1, we use $b = 2$ with digits $(0, 1)$ leading to binary
expansions and $b = 3$ with digits $(0, 1, 2)$ leading to ternary expansions. In §5.2, we
mention $b = 16$ with digits $(0, 1, \ldots, 9, a, b, c, d, e, f)$ leading to hexadecimal
expansions. This completes our discussion of decimal expansions.

How can one tell if a given series converges by inspecting the individual
terms? Here is a necessary condition.


Theorem 1.6.4 (nth-Term Test). If $\sum a_n = L \in \mathbf{R}$, then $a_n \to 0$.


In particular, it follows that the terms of a series must be bounded: There
is a real $C > 0$ with $|a_n| \le C$ for all $n \ge 1$.

To see this, we know that $s_n \to L$, and so $s_{n-1} \to L$. By the triangle
inequality,

\[ |a_n| = |s_n - s_{n-1}| = |(s_n - L) + (L - s_{n-1})| \le |s_n - L| + |s_{n-1} - L| \to 0. \]


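However, a series whose $n$th term approaches zero need not converge. For
example, the harmonic series

\[ \sum_{n=1}^{\infty} \frac1n = 1 + \frac12 + \frac13 + \cdots = \infty. \]

To see this, use comparison as follows,

\[ \sum_{n=1}^{\infty} \frac1n
   = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \frac17 + \frac18 + \cdots
   \ge 1 + \frac12 + \frac14 + \frac14 + \frac18 + \frac18 + \frac18 + \frac18 + \cdots
   = 1 + \frac12 + \frac12 + \frac12 + \cdots = \infty. \]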
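
The grouping argument says that the first $2^k$ terms of the harmonic series already sum to at least $1 + k/2$. Here is a short sketch checking this bound for small $k$ in exact arithmetic; the function name `harmonic` and the range of $k$ are our choices.

```python
from fractions import Fraction

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Each entry records whether s_{2^k} >= 1 + k/2 holds.
checks = [(k, harmonic(2 ** k) >= 1 + Fraction(k, 2)) for k in range(0, 9)]

for k, ok in checks:
    print(k, float(harmonic(2 ** k)), ok)
```

The partial sums grow without bound, but slowly: doubling the number of terms adds only about one half.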


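On the other hand, the nonnegative series

\[ \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots \]

converges. To see this, check that $2^{n-1} \le n!$ by induction. Thus,

\[ 1 + \sum_{n=1}^{\infty} \frac{1}{n!} \le 1 + \sum_{n=1}^{\infty} 2^{-n+1} = 3 \]

and hence is convergent. Since the third partial sum is $s_3 = 2.5$, we see that
the sum lies in the interval $(2.5, 3]$.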
A series is telescoping if it is a sum of differences, i.e., of the form

\[ \sum_{n=1}^{\infty} (a_n - a_{n+1}) = (a_1 - a_2) + (a_2 - a_3) + (a_3 - a_4) + \ldots. \]

In this case, the following is true.

Theorem 1.6.5. If $(a_n)$ is any sequence converging to zero, then the
corresponding telescoping series converges, and its sum is $a_1$.


This follows since the partial sums are

\[ s_n = (a_1 - a_2) + (a_2 - a_3) + \cdots + (a_n - a_{n+1}) = a_1 - a_{n+1} \]

and $a_{n+1} \to 0$.
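As an application, note that

\[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots = 1 \]

since

\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \sum_{n=1}^{\infty} \left( \frac1n - \frac{1}{n+1} \right) = 1. \]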
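
The collapse of the partial sums can be observed directly. This sketch, with a function name of our choosing, sums the first $n$ terms $1/(k(k+1))$ exactly and compares the result with $a_1 - a_{n+1} = 1 - 1/(n+1)$.

```python
from fractions import Fraction

def telescoping_partial(n):
    """Sum of 1/(k(k+1)) for k = 1..n, computed term by term."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 100):
    print(n, telescoping_partial(n), 1 - Fraction(1, n + 1))
```

Both columns agree for every $n$, and the common value approaches 1.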


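Another application is to establish the convergence of

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \sum_{n=2}^{\infty} \frac{1}{n^2}
   < 1 + \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = 1 + \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 2. \]

Thus,

\[ \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots < 2. \]

Expressing this sum in terms of familiar quantities is a question of a totally
different magnitude. Later (§5.6), we will see how this is done.
More generally,

\[ \sum_{n=1}^{\infty} \frac{1}{n^{1+1/N}} < \infty \]

follows in a similar manner, for any $N \ge 1$.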


Exercises 


1.6.1. Let $0 < x < 1$ be real. Then $x = .d_1 d_2 \ldots$ is in $\mathbf{Q}$ iff there are $n, m \ge 1$,
such that $d_{j+n} = d_j$ for $j > m$, i.e., the sequence of digits repeats every $n$
digits, from the $(m+1)$st digit on.


1.6.2. Suppose that $.d_1 d_2 \cdots = .e_1 e_2 \ldots$ are distinct decimal expansions for
the same real, and let $N$ be the first natural with $d_N \ne e_N$. Then either
$d_N = e_N + 1$, $d_k = 0$, and $e_k = 9$ for $k > N$, or $e_N = d_N + 1$, $e_k = 0$, and
$d_k = 9$ for $k > N$. Conclude that $x > 0$ has more than one decimal expansion
iff $10^N x \in \mathbf{N}$ for some $N \in \mathbf{N} \cup \{0\}$.




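1.6.3. Fix $N \ge 1$. Show that


A. $\left( \dfrac{n}{n+1} \right)^{1/N} \le 1 - \dfrac{1}{N(n+1)}$, $n \ge 1$,

B. $(n+1)^{1/N} - n^{1/N} \ge \dfrac{1}{N (n+1)^{(N-1)/N}}$, $n \ge 1$, and

C. $\displaystyle\sum_{n=2}^{\infty} \frac{1}{n^{1+1/N}} \le N \sum_{n=1}^{\infty} (a_n - a_{n+1})$ where $a_n = 1/(n^{1/N})$, $n \ge 1$.


Conclude that $\displaystyle\sum_{n=1}^{\infty} \frac{1}{n^{1+1/N}} < \infty$. (Use Exercises 1.3.22 and 1.4.6 for A.)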


1.6.4. Let $(d_n)$ and $(e_n)$ be as in Exercise 1.5.12. By induction, show that

\[ e_{n+2} \le 10^{-2^n}, \qquad n \ge 1. \]

This shows that the decimal expansion of $d_{n+2}$ agrees⁸ with that of $\sqrt2$ to $2^n$
places. For example, $d_9$ yields $\sqrt2$ to at least 128 decimal places. (First, show
that $e_3 \le 1/100$. Since the point here is to compute the decimal expansion of
$\sqrt2$, do not use it in your derivation. Use only $1 < \sqrt2 < 2$ and $(\sqrt2)^2 = 2$.)


1.6.5. Let $C \subset [0, 1]$ be the set of reals $x = .d_1 d_2 d_3 \ldots$ whose decimal digits
$d_n$, $n \ge 1$, are zero or odd. Show that (§1.2) $C + C = [0, 2]$.


1.7 Signed Series and Cauchy Sequences 


A series is signed if its first term is positive and at least one of its terms is
negative. A series is alternating if it is of the form

\[ \sum_{n=1}^{\infty} (-1)^{n-1} a_n = a_1 - a_2 + a_3 - \cdots + (-1)^{n-1} a_n + \ldots \]

with $a_n$ positive for all $n \ge 1$. Alternating series are particularly tractable,
but, first, we need a new concept.

A sequence (not a series!) $(a_n)$ is Cauchy if its terms approach each other,
i.e., if $|a_m - a_k|$ is small when $m$ and $k$ are large. We make this precise by
defining $(a_n)$ to be Cauchy if there is a positive sequence $(e_n)$ converging to
zero, such that

\[ |a_m - a_k| \le e_n, \qquad \text{for all } m, k \ge n, \text{ for all } n \ge 1. \]

If a sequence is Cauchy, there are many choices for $(e_n)$. Any such sequence
$(e_n)$ is an error sequence for the Cauchy sequence $(a_n)$.


8 This algorithm was known to the Babylonians.

It follows from the definition that every Cauchy sequence is bounded,
$|a_m| \le |a_m - a_1| + |a_1| \le e_1 + |a_1|$ for all $m \ge 1$.

It is easy to see that a convergent sequence is Cauchy. Indeed, if $(a_n)$
converges to $L$, then $b_n = |a_n - L| \to 0$, so (§1.5), $b_n^* \to 0$. Hence by the
triangle inequality

\[ |a_m - a_k| \le |a_m - L| + |a_k - L| \le b_n^* + b_n^*, \qquad m, k \ge n \ge 1. \]

Since $2b_n^* \to 0$, $(2b_n^*)$ is an error sequence for $(a_n)$, so $(a_n)$ is Cauchy.

The following theorem shows that if the terms of a sequence “approach
each other,” then they “approach something.” To see that this is not a
self-evident assertion, consider the following example. Let $a_n$ be the rational
given by the first $n$ places in the decimal expansion of $\sqrt2$. Then $|a_n - \sqrt2| \le
10^{-n}$, hence $a_n \to \sqrt2$, hence $(a_n)$ is Cauchy. But, as far as $\mathbf{Q}$ is concerned,
there is no limit, since $\sqrt2 \notin \mathbf{Q}$. In other words, to actually establish the
existence of the limit, one needs an additional property not enjoyed by $\mathbf{Q}$,
the completeness property of $\mathbf{R}$.


Theorem 1.7.1. A Cauchy sequence $(a_n)$ is convergent.


With the notation of §1.5, we need to show that $a_* = a^*$. But this follows
since the sequence is Cauchy. Indeed, let $(e_n)$ be any error sequence. Then
for all $n \ge 1$, $m \ge n$, $j \ge n$, we have $a_m - a_j \le e_n$. For $n$ and $j$ fixed, this
inequality is true for all $m \ge n$. Taking the sup over all $m \ge n$ yields

\[ a_n^* - a_j \le e_n \]

for all $j \ge n \ge 1$. Now for $n$ fixed, this inequality is true for all $j \ge n$. Taking
the sup over all $j \ge n$ and using $\sup(-A) = -\inf A$ yields

\[ 0 \le a_n^* - a_{n*} \le e_n, \qquad n \ge 1. \]

Letting $n \nearrow \infty$ yields $0 \le a^* - a_* \le 0$, hence $a^* = a_*$.

A series $\sum a_n$ is said to be absolutely convergent if $\sum |a_n|$ converges. For
example, below, we will see that $\sum (-1)^{n-1}/n$ converges. Since $\sum 1/n$
diverges, however, $\sum (-1)^{n-1}/n$ does not converge absolutely. A convergent
series that is not absolutely convergent is conditionally convergent. On the
other hand, every nonnegative convergent series is absolutely convergent.

Let $\sum a_n$ be an absolutely convergent series and let $e_n = \sum_{k=n}^{\infty} |a_k|$. Since
$e_n$ is the tail of a convergent series, we have $e_n \to 0$. Then $(e_n)$ is an error
sequence for the sequence $(s_n)$ of partial sums of $\sum a_n$, because

\[ |s_m - s_k| = |a_{k+1} + \cdots + a_m| \le |a_{k+1}| + \cdots + |a_m| \le e_n, \qquad m > k \ge n. \]

Thus, $(s_n)$ is Cauchy; hence, $(s_n)$ is convergent. We have shown

Theorem 1.7.2. If $\sum |a_n|$ converges, then $\sum a_n$ converges, and

\[ \left| \sum a_n \right| \le \sum |a_n|. \]

The inequality follows from the triangle inequality.

A typical application of this result is as follows.

If $(a_n)$ is a sequence of positive reals decreasing to zero and $(b_n)$ is bounded,
then

\[ \sum_{n=1}^{\infty} (a_n - a_{n+1}) b_n \tag{1.7.1} \]

converges absolutely. Indeed, if $|b_n| \le C$, $n \ge 1$, is a bound for $(b_n)$, then

\[ \sum_{n=1}^{\infty} |(a_n - a_{n+1}) b_n| \le \sum_{n=1}^{\infty} (a_n - a_{n+1}) C = C a_1 < \infty \]

since the last series is telescoping.
To extend the scope of this last result, we will need the following
elementary formula:

\[ a_1 b_1 + \sum_{n=2}^{N} a_n (b_n - b_{n-1}) = \sum_{n=1}^{N-1} (a_n - a_{n+1}) b_n + a_N b_N. \tag{1.7.2} \]

This important identity, easily verified by decomposing the sums, is called
summation by parts.
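
Since the summation by parts identity is purely algebraic, it can be spot-checked on arbitrary finite data. The sketch below does so with randomly chosen rationals; the names `lhs` and `rhs`, the list-based 0-indexed representation (so `a[0]` plays the role of $a_1$), and the fixed random seed are all our choices.

```python
from fractions import Fraction
import random

def lhs(a, b):
    """a_1 b_1 + sum_{n=2}^{N} a_n (b_n - b_{n-1}), with 0-based lists."""
    return a[0] * b[0] + sum(a[n] * (b[n] - b[n - 1]) for n in range(1, len(a)))

def rhs(a, b):
    """sum_{n=1}^{N-1} (a_n - a_{n+1}) b_n + a_N b_N, with 0-based lists."""
    last = len(a) - 1
    return sum((a[n] - a[n + 1]) * b[n] for n in range(last)) + a[last] * b[last]

random.seed(0)

def random_rationals(count):
    return [Fraction(random.randint(-9, 9), random.randint(1, 9)) for _ in range(count)]

a_vals = random_rationals(12)
b_vals = random_rationals(12)
print(lhs(a_vals, b_vals) == rhs(a_vals, b_vals))
```

No monotonicity or boundedness is needed for the identity itself; those hypotheses enter only when passing to the limit.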


Theorem 1.7.3 (Dirichlet Test). If $(a_n)$ is a positive sequence decreasing
to zero and $(c_n)$ is such that the sequence $b_n = c_1 + c_2 + \cdots + c_n$, $n \ge 1$, is
bounded, then $\sum_{n=1}^{\infty} a_n c_n$ converges and

\[ \sum_{n=1}^{\infty} a_n c_n = \sum_{n=1}^{\infty} (a_n - a_{n+1}) b_n. \tag{1.7.3} \]


This is an immediate consequence of letting $N \nearrow \infty$ in (1.7.2) since $b_n -
b_{n-1} = c_n$ for $n \ge 2$. An important aspect of the Dirichlet test is that
the right side of (1.7.3) is, from above, absolutely convergent, whereas the
left side is often only conditionally convergent. For example, taking $a_n = 1/n$
and $c_n = (-1)^{n-1}$, $n \ge 1$, yields $(b_n) = (1, 0, 1, 0, \ldots)$. Hence we conclude
not only that

\[ 1 - \frac12 + \frac13 - \frac14 + \cdots \]

converges but also that its sum equals the sum of the absolutely convergent
series obtained by grouping the terms in pairs.
Now we can state the situation with alternating series.

Theorem 1.7.4 (Leibnitz Test). If $(a_n)$ is a positive decreasing sequence
with $a_n \searrow 0$, then

\[ a_1 - a_2 + a_3 - a_4 + \ldots \tag{1.7.4} \]

converges to a limit $L$ satisfying $0 \le L \le a_1$. If, in addition, $(a_n)$ is strictly
decreasing, then $0 < L < a_1$. Moreover, if $s_n$ denotes the $n$th partial sum,
$n \ge 1$, then the error $|L - s_n|$ at the $n$th stage is no greater than the $(n+1)$st
term $a_{n+1}$, with $L \ge s_n$ or $L \le s_n$ according to whether the $(n+1)$st term is
added or subtracted.


For example,

\[ L = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \cdots \tag{1.7.5} \]

converges, and $0 < L < 1$. In fact, since $s_2 = 2/3$ and $s_3 = 13/15$, $2/3 < L <
13/15$.
In the previous section, estimating the sum of 
1 1 1 1 


to one decimal place involved estimating the entire series. Here the situation
is markedly different: The absolute error between the sum and the $n$th partial
sum is no larger than the next term $a_{n+1}$.
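
The bracketing behavior can be seen numerically for the series (1.7.5). In the sketch below, the names `term` and `partial` are ours; the code only inspects partial sums and makes no use of the value of $L$. Consecutive partial sums differ by exactly the next term, the even-indexed ones increase, the odd-indexed ones decrease, and so each adjacent pair traps the sum between them.

```python
from fractions import Fraction

def term(n):
    """a_n = 1/(2n - 1), the nth term of (1.7.5) without its sign."""
    return Fraction(1, 2 * n - 1)

def partial(n):
    """s_n = a_1 - a_2 + a_3 - ... with n terms."""
    return sum((-1) ** (k - 1) * term(k) for k in range(1, n + 1))

s = [partial(n) for n in range(1, 12)]  # s[0] is s_1, s[1] is s_2, ...
evens = s[1::2]  # s_2, s_4, ... (lower bounds for the sum)
odds = s[0::2]   # s_1, s_3, ... (upper bounds for the sum)

print(s[1], s[2])                     # s_2 and s_3
print(float(max(evens)), float(min(odds)))
```

After eleven terms the sum is already confined to an interval of length $a_{11} = 1/21$.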

To derive the Leibnitz test, clearly, the convergence of (1.7.4) follows by
taking $c_n = (-1)^{n-1}$ and applying the Dirichlet test, as above. Now the
differences $a_n - a_{n+1}$, $n = 1, 3, 5, \ldots$, are nonnegative. Grouping the terms
in (1.7.4) in pairs, we obtain $L \ge 0$. Similarly, the differences $-a_n + a_{n+1}$,
$n = 2, 4, 6, \ldots$, are nonpositive. Grouping the terms in (1.7.4) in pairs, we
obtain $L \le a_1$. Thus, $0 \le L \le a_1$. But

\[ (-1)^n (L - s_n) = a_{n+1} - a_{n+2} + a_{n+3} - a_{n+4} + \ldots, \qquad n \ge 1. \]

Repeating the above reasoning, we obtain $0 \le (-1)^n (L - s_n) \le a_{n+1}$, which
implies the rest of the statement. If, in addition, $(a_n)$ is strictly decreasing,
this reasoning yields $0 < L < a_1$.

If $a_1 + a_2 + a_3 + \ldots$ is absolutely convergent, its alternating version is
$a_1 - a_2 + a_3 - \ldots$. For example, the alternating version of

\[ \frac{1}{1 - x} = 1 + x + x^2 + \ldots \]

equals

\[ \frac{1}{1 + x} = 1 - x + x^2 - \ldots. \]

Clearly, the alternating version is also absolutely convergent and the
alternating version of the alternating version of a series is itself. Note that the
alternating version of a series $\sum a_n$ need not be an alternating series. This
happens iff $\sum a_n$ is nonnegative.

Our next topic is the rearrangement of series. A series $\sum A_n$ is a
rearrangement of a series $\sum a_n$ if there is a bijection (§1.1) $f : \mathbf{N} \to \mathbf{N}$, such that
$A_n = a_{f(n)}$ for all $n \ge 1$.


Theorem 1.7.5. If $\sum a_n$ is nonnegative, any rearrangement $\sum A_n$ converges
to the same limit. If $\sum a_n$ is absolutely convergent, any rearrangement $\sum A_n$
converges absolutely to the same limit.


Nonnegative convergence and absolute convergence are stated separately
because the case $\sum a_n = \infty$ is included in nonnegative convergence.

To see this, first assume $\sum |a_n| < \infty$, and let $e_n = \sum_{j=n}^{\infty} |a_j|$, $n \ge 1$.
Then $(e_n)$ is an error sequence for the Cauchy sequence of partial sums of
$\sum |a_n|$. Since (§1.5) $(f(n))$ is a sequence of distinct naturals, $f(n) \to \infty$. In
fact (§1.5), if we let $f_*(n) = \inf\{f(k) : k \ge n\}$, then $f_*(n) \nearrow \infty$. To show
that $\sum |A_n|$ is convergent, it is enough to show that $\sum |A_n|$ is Cauchy.

To this end, for $m > k \ge n$,

\[ |A_k| + |A_{k+1}| + \cdots + |A_m| = |a_{f(k)}| + |a_{f(k+1)}| + \cdots + |a_{f(m)}|
   \le |a_{f_*(n)}| + |a_{f_*(n)+1}| + |a_{f_*(n)+2}| + \cdots \le e_{f_*(n)}, \]

which approaches zero, as $n \nearrow \infty$. Thus, $\sum |A_n|$ is Cauchy, hence convergent.
Hence $\sum A_n$ is absolutely convergent.

Now let $s_n$, $S_n$ denote the partial sums of $\sum a_n$ and $\sum A_n$, respectively.
Let $E_n = \sum_{k=n}^{\infty} |A_k|$, $n \ge 1$. Then $(E_n)$ is an error sequence for the Cauchy
sequence of partial sums of $\sum |A_n|$. Now in the difference $S_n - s_n$, there will
be cancellation, the only terms remaining being of one of two forms, either
$A_k = a_{f(k)}$ with $f(k) > n$ or $a_k$ with $k = f(j)$ with $j > n$ (this is where
surjectivity of $f$ is used). Hence in either case, the absolute values of the
remaining terms in $S_n - s_n$ are summands in the series $e_n + E_n$, so

\[ |S_n - s_n| \le e_n + E_n \to 0, \qquad \text{as } n \nearrow \infty. \]

This completes the derivation of the theorem when $\sum |a_n| < \infty$. When $\sum a_n$
is nonnegative and infinite, then any rearrangement must also be so, since
(by what we just showed!) $\sum A_n < \infty$ implies $\sum a_n < \infty$. This completes
the derivation of the nonnegative case.

The situation with conditionally convergent series is strikingly different. 


Theorem 1.7.6. If $\sum a_n$ is conditionally convergent and $c$ is any real
number, then there is a rearrangement $\sum A_n$ converging to $c$.


Let $(a_n^+)$, $(a_n^-)$ denote the nonnegative and the negative terms in the series
$\sum a_n$. Then we must have $\sum a_n^+ = \infty$ and $\sum a_n^- = -\infty$. Otherwise, $\sum a_n$
would converge absolutely. Moreover, $a_n^+ \to 0$ and $a_n^- \to 0$ since $a_n \to 0$. We
construct a rearrangement as follows: Take the minimum number of terms
$a_n^+$ whose sum $s_1^+$ is greater than $c$, then take the minimum number of terms
$a_n^-$ whose sum $s_1^-$ with $s_1^+$ is less than $c$, then take the minimum number
of additional terms $a_n^+$ whose sum $s_2^+$ with $s_1^-$ is greater than $c$, then take
the minimum number of additional terms $a_n^-$ whose sum $s_2^-$ with $s_2^+$ is less
than $c$, etc. Because $a_n^+ \to 0$, $a_n^- \to 0$, $\sum a_n^+ = \infty$, and $\sum a_n^- = -\infty$, this
rearrangement of the terms produces a series converging to $c$. Of course, if
$c < 0$, one starts with the negative terms.
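
The construction can be tried out on the conditionally convergent series $1 - \frac12 + \frac13 - \cdots$. The following is a sketch under simplifying assumptions, not a transcription of the proof: the function name `rearrange` is ours, the rule is applied one term at a time (add the next unused positive term while the running sum is at most $c$, otherwise subtract the next unused negative term), floating-point arithmetic is used, and only finitely many terms are produced.

```python
def rearrange(c, steps):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at the target c.

    Returns the running sum after `steps` terms and the list of
    denominators used, in the order they were taken.
    """
    pos, neg = 1, 2          # next unused odd / even denominator
    total, used = 0.0, []
    for _ in range(steps):
        if total <= c:
            total += 1.0 / pos   # take the next positive term
            used.append(pos)
            pos += 2
        else:
            total -= 1.0 / neg   # take the next negative term
            used.append(neg)
            neg += 2
    return total, used

total_hi, used_hi = rearrange(1.5, 20000)
total_lo, used_lo = rearrange(-0.5, 20000)
print(total_hi, total_lo)
print(used_hi[:12])  # the order in which the original terms are consumed
```

Each original term is used at most once, and the overshoot past $c$ is bounded by the size of the term that caused the crossing, which is why the running sums settle toward the target.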

We can use the fact that the sum of a nonnegative or absolutely convergent
series is unchanged under rearrangements to study series over other sets. For
example, let $\mathbf{N}^2 = \mathbf{N} \times \mathbf{N}$ be (§1.1) the set of ordered pairs of naturals $(m, n)$,
and set

\[ \sum_{(m,n) \in \mathbf{N}^2} \frac{1}{m^3 + n^3}. \tag{1.7.6} \]

What do we mean by such a series? To answer this, we begin with a definition. 

A set A is countable if there is a bijection f : N > A, i.e., the elements of 
A form a sequence. If there is no such f, we say that A is uncountable. Let 
us show that N? is countable: 


(1, 1), (1, 2), (2,1), (1,3), (2, 2), (3, 1), (1, 4), (2,3), (3, 2), (4,1),.... 


Here we are listing the pairs (m,n) according to the sum m+n of their entries. 
It turns out that Q is countable (Exercise 1.7.2), but R is uncountable 
(Exercise 1.7.4). 
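
The listing of $\mathbf{N}^2$ by the sum of the entries is easy to generate mechanically. Here is a short sketch (the generator name `cauchy_order` is an illustrative choice, not from the text) that reproduces the first ten pairs displayed above.

```python
from itertools import islice

def cauchy_order():
    """Yield the pairs (m, n) of naturals, grouped by the sum m + n."""
    total = 2                      # smallest possible sum, from (1, 1)
    while True:
        for m in range(1, total):  # m = 1, ..., total - 1
            yield (m, total - m)
        total += 1

first_ten = list(islice(cauchy_order(), 10))
print(first_ten)
```

Every pair $(m,n)$ appears exactly once, in the block with sum $m+n$, so the generator describes a bijection of $\mathbf{N}$ with $\mathbf{N}^2$.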

Every subset of $\mathbf{N}$ is countable or finite (Exercise 1.7.1). Thus, if
$f : A \to B$ is an injection and $B$ is countable, then $A$ is finite or
countable. Indeed, choosing a bijection $g : B \to \mathbf{N}$ yields a
bijection $g \circ f$ of $A$ with the subset $(g \circ f)(A) \subset \mathbf{N}$.

Similarly, if $f : A \to B$ is a surjection and $A$ is countable, then $B$ is
countable or finite. To see this, choose a bijection $g : \mathbf{N} \to A$.
Then $f \circ g : \mathbf{N} \to B$ is a surjection, so we may define
$h : B \to \mathbf{N}$ by setting $h(b)$ equal to the least $n$ satisfying
$f[g(n)] = b$, $h(b) = \min\{n \in \mathbf{N} : f[g(n)] = b\}$. Then $h$ is an
injection, and thus, $B$ is finite or countable.

Let $A$ be a countable set. Given a nonnegative function $f : A \to \mathbf{R}$,
we define the sum of the series over $A$

\[ \sum_{a\in A} f(a) \tag{1.7.7} \]

as the sum of $\sum_{n=1}^\infty f(a_n)$ obtained by taking any bijection of $A$
with $\mathbf{N}$. Since the sum of a nonnegative series is unchanged by
rearrangement, this is well defined. As an exercise, we leave it to be shown
that (1.7.6) converges.

Series $\sum_{(m,n)} a_{mn}$ over $\mathbf{N}^2$ are called double series over
$\mathbf{N}$. A useful arrangement of a double series follows the order of
$\mathbf{N}^2$ displayed above,

\[ \sum_{(m,n)\in\mathbf{N}^2} a_{mn} = \sum_{n=1}^{\infty} \Bigl( \sum_{i+j=n+1} a_{ij} \Bigr). \tag{1.7.8} \]

This is the Cauchy order.




Let $\sum a_{mn}$ be a nonnegative double series. Then we have two
corresponding iterated series

\[ \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr) \quad\text{and}\quad \sum_{m=1}^{\infty} \Bigl( \sum_{n=1}^{\infty} a_{mn} \Bigr). \tag{1.7.9} \]

We clarify the meaning of the series on the left, since the series on the right
is obtained by switching the indices $m$ and $n$.

For each $n \ge 1$, the series $S_n = \sum_{m=1}^\infty a_{mn}$ is a nonnegative
real or $\infty$. If $S_n < \infty$ for all $n \ge 1$, then
$S = \sum_{n=1}^\infty S_n$ is well defined. If $S < \infty$, it is reasonable
to set $\sum_n (\sum_m a_{mn}) = S$. Otherwise, either $S$ or at least one of
the terms $S_n$ is infinite; in this case, it makes sense to set
$\sum_n (\sum_m a_{mn}) = \infty$. With these clarifications, we have the
following result.


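Theorem 1.7.7. If a double series $\sum a_{mn}$ is nonnegative, then it equals
either iterated series:

\[ \sum_{(m,n)\in\mathbf{N}^2} a_{mn} = \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) = \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr) = \sum_{m=1}^{\infty} \Bigl( \sum_{n=1}^{\infty} a_{mn} \Bigr). \tag{1.7.10} \]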


Note this result is valid whether $\sum a_{mn} < \infty$ or
$\sum a_{mn} = \infty$. To see this, recall that the first equality is due to
the fact that a nonnegative double series may be summed in any order. Since the
third and fourth sums are similar, it is enough to derive the second equality.

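For any natural $K$, the set $A_K \subset \mathbf{N}^2$ of pairs $(i,j)$ with
$i+j \le K+1$ is contained in the set $B_{MN} \subset \mathbf{N}^2$ of pairs
$(m,n)$ with $m \le M$, $n \le N$, for $M$, $N$ large enough (Figure 1.4). Hence

\[ \sum_{k=1}^{K} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \le \sum_{n=1}^{N} \Bigl( \sum_{m=1}^{M} a_{mn} \Bigr) \le \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr). \]

Letting $K \nearrow \infty$, we obtain

\[ \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \le \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr). \]

Conversely, for any $M$, $N$, $B_{MN} \subset A_K$ for $K$ large enough, hence

\[ \sum_{n=1}^{N} \Bigl( \sum_{m=1}^{M} a_{mn} \Bigr) \le \sum_{k=1}^{K} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \le \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr). \]

Letting $M \nearrow \infty$,

\[ \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \ge \sum_{n=1}^{N} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr). \]

Letting $N \nearrow \infty$,

\[ \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \ge \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr). \]

This yields (1.7.10).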
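
The sandwich between triangular sets $A_K$ and rectangles $B_{MN}$ can be watched on a concrete example. The sketch below is illustrative only: the double series $a_{mn} = 2^{-m}3^{-n}$ and the helper names are assumptions made for the demonstration. Its iterated sum is $(\sum_m 2^{-m})(\sum_n 3^{-n}) = 1 \cdot \tfrac12 = \tfrac12$.

```python
def a(m, n):
    """Sample nonnegative double series a_mn = 2^(-m) 3^(-n) (an assumed example)."""
    return 2.0 ** (-m) * 3.0 ** (-n)

def cauchy_partial(K):
    """Sum of a_ij over the triangle A_K = {i + j <= K + 1}, in Cauchy order."""
    return sum(a(i, k + 1 - i) for k in range(1, K + 1) for i in range(1, k + 1))

def rectangle_partial(M, N):
    """Sum of a_mn over the rectangle B_MN = {m <= M, n <= N}."""
    return sum(sum(a(m, n) for m in range(1, M + 1)) for n in range(1, N + 1))

# A_5 is contained in B_55, which is contained in A_10.
print(cauchy_partial(5), rectangle_partial(5, 5), cauchy_partial(10))
print(cauchy_partial(60), rectangle_partial(60, 60))  # both close to 0.5
```

Both kinds of partial sums increase to the same limit, which is the content of the second equality in (1.7.10) for this example.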


Fig. 1.4 The sets $B_{MN}$ and $A_K$


To give an application of this, note that since $\sum 1/n^{1+1/N}$ converges, by
comparison, so does

\[ Z(s) = \sum_{n=2}^{\infty} \frac{1}{n^s}, \qquad s > 1. \]

(In the next chapter, we will know what $n^s$ means for $s$ real. Now think of
$s$ as rational.) Then (1.7.10) can be used to show that

\[ \sum_{n=2}^{\infty} \frac{1}{n^s - 1} = Z(s) + Z(2s) + Z(3s) + \cdots. \tag{1.7.11} \]
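
Identity (1.7.11) can be sanity-checked numerically before deriving it. The sketch below is a rough check under stated assumptions: both sides are truncated (the cutoffs `n_max` and `k_max` and the function names are illustrative choices), and $s = 2$ is used because the left side then telescopes to $3/4$.

```python
def Z(s, n_max=2000):
    """Truncated Z(s) = sum over n >= 2 of n^(-s)."""
    return sum(float(n) ** (-s) for n in range(2, n_max + 1))

def lhs(s, n_max=2000):
    """Truncated left side of (1.7.11): sum over n >= 2 of 1 / (n^s - 1)."""
    return sum(1.0 / (float(n) ** s - 1.0) for n in range(2, n_max + 1))

def rhs(s, k_max=40, n_max=2000):
    """Truncated right side of (1.7.11): Z(s) + Z(2s) + ... + Z(k_max * s)."""
    return sum(Z(k * s, n_max) for k in range(1, k_max + 1))

print(lhs(2), rhs(2))  # both near 3/4, up to the truncation in n
```

The agreement reflects the geometric series $1/(n^s - 1) = \sum_{k\ge1} n^{-ks}$, summed over $n$ in one order on the left and in the other order on the right.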


Let $A$ be a countable set. Given a signed function $f : A \to \mathbf{R}$, we
say (1.7.7) is summable* if

\[ \sum_{a\in A} |f(a)| < \infty. \]

* The nonnegative/summable separation is analogous to the nonnegative/integrable
separation appearing in Chapters 4 and 5.

In this case, we define (1.7.7) as the sum of $\sum_{n=1}^\infty f(a_n)$
obtained by taking any bijection of $A$ with $\mathbf{N}$. Since the sum of an
absolutely convergent series is unchanged by rearrangement, this is well
defined.

Turning to signed series over $\mathbf{N}^2$, note that when a double series is
summable, the series arranged in Cauchy order (1.7.8) is absolutely convergent,
since

\[ \sum_{n=1}^{\infty} \Bigl| \sum_{i+j=n+1} a_{ij} \Bigr| \le \sum_{n=1}^{\infty} \Bigl( \sum_{i+j=n+1} |a_{ij}| \Bigr) < \infty \]

by (1.7.10).

We clarify the meaning of the iterated series (1.7.9) in the summable case.
Applying the nonnegative case to the double series
$\sum_{(m,n)} |a_{mn}| < \infty$, we have $\sum_m |a_{mn}| < \infty$ for all
$n \ge 1$, and hence $S_n = \sum_m a_{mn}$ is a well-defined real for all
$n \ge 1$. Moreover, since $|S_n| \le \sum_m |a_{mn}|$, by the nonnegative case
again,

\[ \sum_{n=1}^{\infty} |S_n| \le \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} |a_{mn}| \Bigr) < \infty; \]

hence, $\sum_n S_n$ is absolutely convergent to a real $S$, which we set as the
sum of the iterated series $\sum_n (\sum_m a_{mn})$.


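Theorem 1.7.8. If a double series $\sum a_{mn}$ is summable, then it equals
either iterated series:

\[ \sum_{(m,n)\in\mathbf{N}^2} a_{mn} = \sum_{k=1}^{\infty} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) = \sum_{n=1}^{\infty} \Bigl( \sum_{m=1}^{\infty} a_{mn} \Bigr) = \sum_{m=1}^{\infty} \Bigl( \sum_{n=1}^{\infty} a_{mn} \Bigr). \tag{1.7.12} \]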


To see this, recall that the first equality is due to the fact that a summable 
double series may be summed in any order. Since the third and fourth sums 
are similar, it is enough to derive the second equality. 

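First, since the iterated series converges absolutely (1.7.10), the tail

\[ A_L = \sum_{n=L}^{\infty} \Bigl( \sum_{m=1}^{\infty} |a_{mn}| \Bigr) \to 0 \quad\text{as } L \to \infty. \]

Similarly, and since we know the iterated series are equal in the nonnegative
case, the tail

\[ B_L = \sum_{n=1}^{\infty} \Bigl( \sum_{m=L}^{\infty} |a_{mn}| \Bigr) = \sum_{m=L}^{\infty} \Bigl( \sum_{n=1}^{\infty} |a_{mn}| \Bigr) \to 0 \quad\text{as } L \to \infty. \]

Moreover, $m+n > 2L$ implies $m > L$ or $n > L$; hence for $P \ge 2L$,

\[ \sum_{k=2L}^{P} \Bigl( \sum_{i+j=k+1} |a_{ij}| \Bigr) \le \sum_{n=L}^{\infty} \Bigl( \sum_{m=1}^{\infty} |a_{mn}| \Bigr) + \sum_{m=L}^{\infty} \Bigl( \sum_{n=1}^{\infty} |a_{mn}| \Bigr) = A_L + B_L; \]

letting $P \to \infty$, we see the tail

\[ C_L = \sum_{k=2L}^{\infty} \Bigl( \sum_{i+j=k+1} |a_{ij}| \Bigr) \le A_L + B_L \to 0 \quad\text{as } L \to \infty. \]

But the absolute value of the difference between the series
$\sum_n (\sum_m a_{mn})$ and the finite sum

\[ \sum_{n=1}^{L} \Bigl( \sum_{m=1}^{L} a_{mn} \Bigr) \]

is no larger than $A_L + B_L$, hence vanishes as $L \to \infty$. Also, the
absolute value of the difference between the double series and the finite sum

\[ \sum_{k=1}^{2L} \Bigl( \sum_{i+j=k+1} a_{ij} \Bigr) \]

is no larger than $C_L$, hence vanishes as $L \to \infty$. Finally, the absolute
value of the difference of the finite sums themselves is no larger than
$A_L + B_L + C_L$, hence vanishes as $L \to \infty$.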

Above we described double series over N. We could have just as easily 
started the index from zero, describing double series over N U {0}. This is 
important for Taylor series (§3.5), where the index starts from n = 0. 

As an application, we describe the product of two series $\sum a_n$ and
$\sum b_n$. We do this for series with the index starting from zero. The product
or Cauchy product of series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$
is the series

\[ \sum_{n=0}^{\infty} c_n = \sum_{n=0}^{\infty} \Bigl( \sum_{i+j=n} a_i b_j \Bigr) = a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + \cdots. \]
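
The Cauchy product is a finite convolution at each index, so it is straightforward to compute from truncated coefficient lists. The sketch below is an illustration under assumptions: the function name `cauchy_product` is hypothetical, and the geometric series with ratio $x = 1/2$ is chosen as the example because its square is known in closed form, $c_n = (n+1)x^n$.

```python
def cauchy_product(a, b):
    """Return c_0, ..., c_{N-1} with c_n = sum over i + j = n of a_i * b_j.

    a and b hold the leading terms of two series, indexed from zero;
    N is the shorter of the two lengths.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(n_terms)]

x = 0.5
a = [x ** n for n in range(60)]   # geometric series, sum close to 1 / (1 - x) = 2
c = cauchy_product(a, a)

print(c[:4])              # (n + 1) x^n for n = 0, 1, 2, 3
print(sum(a) ** 2, sum(c))  # both close to 4
```

For this absolutely convergent example the sum of the product series matches the product of the sums, up to truncation.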


Theorem 1.7.9. If $a = \sum_{n=0}^\infty a_n$ and $b = \sum_{n=0}^\infty b_n$
are both nonnegative or both absolutely convergent and
$c_n = \sum_{i+j=n} a_i b_j$, $n \ge 0$, then $c = \sum_{n=0}^\infty c_n$ is
nonnegative or absolutely convergent and $ab = c$.


Exercises


1.7.1. If $A \subset B$ and $B$ is countable, then $A$ is countable or finite.
(If $B = \mathbf{N}$, look at the smallest element in $A$, then the next
smallest, and so on.)

1.7.2. Show that $\mathbf{Q}$ is countable.

1.7.3. If $A$ and $B$ are countable, so is $A \times B$. Conclude that
$\mathbf{Q} \times \mathbf{Q}$ is countable.

1.7.4. Show that $[0,1]$ and $\mathbf{R}$ are uncountable. (Assume $[0,1]$ is
countable. List the elements as $a_1, a_2, \dots$. Using the decimal expansions
of $a_1, a_2, \dots$, construct a decimal expansion not in the list.)

1.7.5. Show that (1.7.6) converges.

1.7.6. Derive (1.7.11).

1.7.7. Let $\sum a_n$ and $\sum b_n$ be absolutely convergent. Then the product
of the alternating versions of $\sum a_n$ and $\sum b_n$ is the alternating
version of the product of $\sum a_n$ and $\sum b_n$.


Fig. 1.5 The golden mean x. The ratios of the sides of the rectangles are all x: 1 


1.7.8. Given a sequence $(a_n)$ of naturals, let $x_n$ be as in Exercise 1.5.13.
Show that $(x_n)$ is Cauchy, hence convergent to an irrational $x$. Thus,
continued fractions yield a bijection between sequences of naturals and
irrationals in $(0,1)$. From this point of view, the continued fraction
(Figure 1.5)

\[ x = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} \]

is special. This real $x$ is the golden mean; $x$ satisfies

\[ x = 1 + \frac{1}{x}, \]

which is reflected in the infinite decreasing sequence of rectangles in
Figure 1.5. Show in fact this continued fraction does converge to
$(1 + \sqrt{5})/2$.
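
A quick numerical experiment makes the claim plausible without replacing the proof asked for in the exercise. The sketch below assumes the truncated continued fractions are generated by iterating $x \mapsto 1 + 1/x$ from $x = 1$; the function name `golden_convergents` is an illustrative choice.

```python
from math import sqrt

def golden_convergents(n_steps):
    """Truncations 1, 1 + 1/1, 1 + 1/(1 + 1/1), ... of the continued fraction."""
    x = 1.0
    values = [x]
    for _ in range(n_steps):
        x = 1.0 + 1.0 / x   # one more level of the continued fraction
        values.append(x)
    return values

values = golden_convergents(40)
print(values[:5])                      # 1.0, 2.0, 1.5, 1.666..., 1.6
print(values[-1], (1 + sqrt(5)) / 2)   # nearly equal
```

The iterates alternate around the limit and close in on it quickly, consistent with the fixed-point relation $x = 1 + 1/x$.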


Chapter 2 
Continuity 


2.1 Compactness 


An open interval is a set of reals of the form $(a,b) = \{x : a < x < b\}$. As
in §1.4, we are allowing $a = -\infty$ or $b = \infty$ or both. A compact
interval is a set of reals of the form $[a,b] = \{x : a \le x \le b\}$, where
$a$, $b$ are real. The length of $[a,b]$ is $b - a$. Recall (§1.5) that a
sequence subconverges to $L$ if it has a subsequence converging to $L$.

Recall a subset $K \subset \mathbf{R}$ is bounded if $\sup K$ and $\inf K$ are
finite. We say $K$ is closed if $(x_n) \subset K$ and $x_n \to c$ implies
$c \in K$. For example, from the comparison property of sequences, a compact
interval is closed and bounded.


Theorem 2.1.1. Let $K \subset \mathbf{R}$ be closed and bounded and let $(x_n)$
be any sequence in $K$. Then $(x_n)$ subconverges to some $c$ in $K$.


To derive this result, since $K$ is bounded, we may choose $[a,b]$ with
$K \subset [a,b]$. Divide the interval $I = [a,b]$ into 10 subintervals (of the
same length), and order them from left to right (Figure 2.1),
$I_0, I_1, \dots, I_9$. Pick one of them, say $I_{d_1}$, containing infinitely
many terms of $(x_n)$, i.e., $\{n : x_n \in I_{d_1}\}$ is infinite, and select
one of the terms of the sequence in $I_{d_1}$ and call it $x'_1$. Then the
length of $J_1 = I_{d_1}$ is $(b-a)/10$. Now divide $J_1$ into 10 subintervals
again ordered left to right and called $I_{d_1 0}, \dots, I_{d_1 9}$. Select¹
one of them, say $I_{d_1 d_2}$, containing infinitely many terms of the
sequence, and pick one of the terms (beyond $x'_1$) in the sequence in
$I_{d_1 d_2}$ and call it $x'_2$. The length of $J_2 = I_{d_1 d_2}$ is
$(b-a)/100$. Continuing by induction, this yields

\[ I \supset J_1 \supset J_2 \supset J_3 \supset \cdots \]

and a subsequence $(x'_n)$, where the length of $J_n$ is $(b-a)10^{-n}$ and
$x'_n \in J_n$ for all $n \ge 1$. But, by construction, the real

\[ c = a + (b-a) \cdot .d_1 d_2 d_3 \dots \]

lies in all the intervals $J_n$, $n \ge 1$ (it may help to momentarily replace
$[a,b]$ by $[0,1]$). Hence,

\[ |x'_n - c| \le (b-a)10^{-n} \to 0. \]

Thus $(x_n)$ subconverges to $c$. Since $K$ is closed, $c \in K$.

1 The choice can be avoided by selecting the leftmost interval at each stage.


Fig. 2.1 The intervals $I_{d_1 d_2 \dots d_n}$


Thus this theorem is equivalent to, more or less, the existence of decimal 
expansions. 

If $K$ is replaced by an open interval $(a,b)$, the theorem is false as it
stands; hence, the theorem needs to be modified. A useful modification is
the following.


Theorem 2.1.2. If $(x_n)$ is a sequence of reals in $(a,b)$, then $(x_n)$
subconverges to some $a < c < b$ or to $a$ or to $b$.


To see this, since $a = \inf(a,b)$, there is (Theorem 1.5.4) a sequence $(c_n)$
in $(a,b)$ satisfying $c_n \to a$. Similarly, since $b = \sup(a,b)$, there is a
sequence $(d_n)$ in $(a,b)$ satisfying $d_n \to b$. Now there is either an
$m \ge 1$ with $(x_n)$ in $[c_m, d_m]$ or not. If so, the result follows from
Theorem 2.1.1. If not, for every $m \ge 1$, there is an $x_{n_m}$ not in
$[c_m, d_m]$. Let $(y_m)$ be the subsequence of $(x_{n_m})$ obtained by
restricting attention to terms satisfying $x_{n_m} > d_m$, and let $(z_m)$ be
the subsequence of $(x_{n_m})$ obtained by restricting attention to terms
satisfying $x_{n_m} < c_m$. Then at least one of the sequences $(y_m)$ or
$(z_m)$ is infinite, so either $y_m \to b$ or $z_m \to a$ (or both) as
$m \to \infty$. Thus $(x_n)$ subconverges to $a$ or to $b$.

Note this result holds even when $a = -\infty$ or $b = \infty$. The remainder of
this section is used only in §6.6 and may be skipped until then.

We say a set $K \subset \mathbf{R}$ is sequentially compact if every sequence
$(x_n) \subset K$ subconverges to some $c \in K$. Thus we conclude every closed
and bounded set is sequentially compact.

A set $U \subset \mathbf{R}$ is open in $\mathbf{R}$ if for every $c \in U$,
there is an open interval $I$ containing $c$ and contained in $U$. Clearly, an
open interval is an open set.

A collection of open sets in $\mathbf{R}$ is a set $\mathcal{U}$ whose elements
are open sets in $\mathbf{R}$. Then by Exercise 2.1.3,

\[ \bigcup \mathcal{U} = \{x : x \in U \text{ for some } U \in \mathcal{U}\} \]

is open in $\mathbf{R}$.

Let $X$ be a set. A sequence of sets in $X$ is a function
$f : \mathbf{N} \to 2^X$. This is written $(A_n) = (A_1, A_2, \dots)$, where
$f(n) = A_n$, $n \ge 1$. If $(A_n)$ is a sequence of sets, its union and
intersection are denoted

\[ \bigcup_{n=1}^{\infty} A_n \quad\text{and}\quad \bigcap_{n=1}^{\infty} A_n. \]

Given a sequence of sets $(A_n)$, using Exercise 1.3.10, one can construct
by induction

\[ \bigcup_{k=1}^{n} A_k \quad\text{and}\quad \bigcap_{k=1}^{n} A_k \]

for $n \ge 1$, by choosing $g(n, A) = A \cup A_{n+1}$ and
$g(n, A) = A \cap A_{n+1}$.

Let $K \subset \mathbf{R}$ be any set. An open cover of $K$ is a collection
$\mathcal{U}$ of open sets in $\mathbf{R}$ whose union contains $K$,
$K \subset \bigcup \mathcal{U}$. If $\mathcal{U}$ and $\mathcal{U}'$ are open
covers and $\mathcal{U} \subset \mathcal{U}'$, we say
$\mathcal{U} \subset \mathcal{U}'$ is a subcover. If $\mathcal{U}$ is countable,
we say $\mathcal{U}$ is a countable open cover. If $\mathcal{U}$ is finite, we
say $\mathcal{U}$ is a finite open cover.


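Theorem 2.1.3. If $K \subset \mathbf{R}$ is sequentially compact, then every
countable open cover has a finite subcover.


To see this, argue by contradiction. Suppose this was not so, and let

\[ \mathcal{U} = (U_1, U_2, \dots) \]

be a countable open cover of $K$ with no finite subcover. Then for each
$n \ge 1$, $U_1 \cup \dots \cup U_n$ does not contain $K$. Since
$K \setminus (U_1 \cup \dots \cup U_n)$ is closed and bounded (Exercise 2.1.3),

\[ x_n = \inf\bigl(K \setminus (U_1 \cup \dots \cup U_n)\bigr) \in K \setminus (U_1 \cup \dots \cup U_n), \qquad n \ge 1. \]

Then $(x_n) \subset K$ so $(x_n)$ subconverges to some $c \in K$. Now select
$U_N$ with $c \in U_N$. Then $x_n \in U_N$ for infinitely many $n$,
contradicting the construction of $x_n$, which lies outside $U_N$ whenever
$n \ge N$.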

We say K is countably compact if every countable open cover has a finite 
subcover. Thus we conclude every sequentially compact set is countably com- 
pact. 


Theorem 2.1.4. If $K \subset \mathbf{R}$ is countably compact, then every open
cover has a finite subcover.


To see this, let $\mathcal{U}$ be an open cover, and let $\mathcal{I}$ be the
collection of open sets $I$ such that


• $I$ is an open interval with rational endpoints, and
• $I \subset U$ for some $U \in \mathcal{U}$.

Then $\mathcal{I}$ is countable and $\bigcup \mathcal{I} = \bigcup \mathcal{U}$.
Thus $\mathcal{I}$ is a countable open cover; hence, there is a finite subcover
$\{I_1, \dots, I_N\} \subset \mathcal{I}$. For each $k = 1, \dots, N$, select²
an open set $U = U_k \in \mathcal{U}$ containing $I_k$. Then
$\{U_1, \dots, U_N\} \subset \mathcal{U}$ is a finite subcover.

We say $K \subset \mathbf{R}$ is compact if every open cover has a finite
subcover. Thus we conclude every countably compact set is compact.
We summarize the results of this section. 


Theorem 2.1.5. For $K \subset \mathbf{R}$, the following are equivalent:


• $K$ is closed and bounded,

• $K$ is sequentially compact,

• $K$ is countably compact,

• $K$ is compact.

To complete the proof of this, it remains to show compactness implies
closed and bounded. So suppose $K$ is compact. Then
$\mathcal{U} = \{(-n, n) : n \ge 1\}$ is an open cover and hence has a finite
subcover. Thus $K$ is bounded. If $(x_n) \subset \mathbf{R}$ converges to
$c \notin K$, let $U_n = \{x : |x - c| > 1/n\}$, $n \ge 1$, and let
$\mathcal{U} = \{U_1, U_2, \dots\}$. Then $\mathcal{U}$ is an open cover and
hence has a finite subcover. This implies $(x_n)$ is not wholly contained in
$K$, which implies $K$ is closed.


In particular, a compact interval [a, b] is a compact set. This is used in §6.6. 


Exercises 


2.1.1. Let $(a_n, b_n)$, $n \ge 1$, be a sequence in $\mathbf{R}^2$. We say
$(a_n, b_n)$, $n \ge 1$, subconverges to $(a,b) \in \mathbf{R}^2$ if there is a
sequence of naturals $(n_k)$ such that $(a_{n_k})$ converges to $a$ and
$(b_{n_k})$ converges to $b$. Show that if $(a_n)$ and $(b_n)$ are bounded, then
$(a_n, b_n)$ subconverges to some $(a,b)$.


2.1.2. In the derivation of the first theorem, suppose that the intervals are
chosen, at each stage, to be the leftmost interval containing infinitely many
terms. In other words, suppose that $I_{d_1}$ is the leftmost of the intervals
$I_j$ containing infinitely many terms, $I_{d_1 d_2}$ is the leftmost of the
intervals $I_{d_1 j}$ containing infinitely many terms, etc. In this case, show
that the limiting point obtained is $x_*$.


2.1.3. If $\mathcal{U}$ is a collection of open sets in $\mathbf{R}$, then
$\bigcup \mathcal{U}$ is open in $\mathbf{R}$. Also if $K$ is closed and $U$ is
open, then $K \setminus U$ is closed.


2 This uses the axiom of finite choice (Exercise 1.3.24).

Let $(a,b)$ be an open interval, and let $a < c < b$. The interval $(a,b)$,
punctured at $c$, is the set $(a,b) \setminus \{c\} = \{x : a < x < b,\ x \ne c\}$.

Let $f$ be a function defined on an interval $(a,b)$ punctured at $c$,
$a < c < b$. We say $L$ is the limit of $f$ at $c$, and we write

\[ \lim_{x \to c} f(x) = L \]

or $f(x) \to L$ as $x \to c$, if, for every sequence $(x_n) \subset (a,b)$
satisfying $x_n \ne c$ for all $n \ge 1$ and converging to $c$, $f(x_n) \to L$.

For example, let $f(x) = x^2$, and let $(a,b) = \mathbf{R}$. If $x_n \to c$,
then (§1.5), $x_n^2 \to c^2$. This holds true no matter what sequence $(x_n)$ is
chosen, as long as $x_n \to c$. Hence, in this case,
$\lim_{x \to c} f(x) = c^2$.

Going back to the general definition, suppose that $f$ is also defined at $c$.
Then the value $f(c)$ has no bearing on $\lim_{x \to c} f(x)$ (Figure 2.2). For
example, if $f(x) = 0$ for $x \ne 0$ and $f(0)$ is defined arbitrarily, then
$\lim_{x \to 0} f(x) = 0$. For a more dramatic example of this phenomenon, see
Exercise 2.2.1.




Fig. 2.2 The value f(c) has no bearing on the limit at c 


Of course, not every function has limits. For example, set $f(x) = 1$ if
$x \in \mathbf{Q}$ and $f(x) = 0$ if $x \in \mathbf{R} \setminus \mathbf{Q}$.
Choose any $c$ in $(a,b) = \mathbf{R}$. Since (§1.4) there is a rational and an
irrational between any two reals, for each $n \ge 1$ we can find
$r_n \in \mathbf{Q}$ and $i_n \in \mathbf{R} \setminus \mathbf{Q}$ with
$c < r_n < c + 1/n$ and $c < i_n < c + 1/n$. Thus $r_n \to c$ and $i_n \to c$,
but $f(r_n) = 1$ and $f(i_n) = 0$ for all $n \ge 1$. Hence, $f$ has no limit
anywhere on $\mathbf{R}$.

Let $f$ be a function defined on an interval $(a,b)$ punctured at $c$,
$a < c < b$. Let $(x_n) \subset (a,b)$ be a sequence satisfying $x_n \ne c$ for
all $n \ge 1$ and converging to $c$. If $x_n \to c$, then $(f(x_n))$ may have
several limit points (Exercise 1.5.9). We say $L$ is a limit point of $f$ at $c$
if for some sequence $x_n \to c$, $L$ is a limit point of $(f(x_n))$. Then the
limit of $f$ at $c$ exists iff all limit points of $f$ at $c$ are equal.

By analogy with sequences, the upper limit of $f$ at $c$ and lower limit of
$f$ at $c$ are³

\[ L^* = \inf_{\delta > 0}\ \sup_{0 < |x - c| < \delta} f(x), \qquad L_* = \sup_{\delta > 0}\ \inf_{0 < |x - c| < \delta} f(x). \]

3 $\sup_A f$ and $\inf_A f$ are alternative notations for $\sup f(A)$ and
$\inf f(A)$.

Then (Exercise 2.2.8) $L^*$ and $L_*$ are the greatest and least limit points of
$f$ at $c$.
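
The definitions of $L^*$ and $L_*$ can be approximated by sampling. The sketch below is only a numerical illustration under stated assumptions: the function $\sin(1/x)$ at $c = 0$ is an example chosen here (it is not taken from the text), the sup and inf over the punctured interval are replaced by a max and min over finitely many sample points, and the helper name `sup_inf_near` is hypothetical.

```python
import math

def sup_inf_near(f, c, delta, samples=100_000):
    """Approximate the sup and inf of f over 0 < |x - c| < delta by sampling."""
    values = []
    for i in range(1, samples + 1):
        h = delta * i / (samples + 1)      # 0 < h < delta, so x = c is never used
        values.append(f(c - h))
        values.append(f(c + h))
    return max(values), min(values)

def osc(x):
    """Example function with no limit at 0."""
    return math.sin(1.0 / x)

for delta in (0.1, 0.01, 0.001):
    print(delta, sup_inf_near(osc, 0.0, delta))
```

For every $\delta$ tried, the sampled sup stays near $1$ and the sampled inf near $-1$, suggesting $L^* = 1$ and $L_* = -1$ for this example, so the limit at $0$ does not exist.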

Let $(x_n)$ be a sequence approaching $b$. If $x_n < b$ for all $n \ge 1$, we
write $x_n \to b-$. Let $f$ be defined on $(a,b)$. We say $L$ is the limit of
$f$ at $b$ from the left, and we write

\[ \lim_{x \to b-} f(x) = L, \]

if $x_n \to b-$ implies $f(x_n) \to L$. In this case, we also write
$f(b-) = L$. If $b = \infty$, we write, instead,
$\lim_{x \to \infty} f(x) = L$, $f(\infty) = L$, i.e., we drop the minus.

Let $(x_n)$ be a sequence approaching $a$. If $x_n > a$ for all $n \ge 1$, we
write $x_n \to a+$. Let $f$ be defined on $(a,b)$. We say $L$ is the limit of
$f$ at $a$ from the right, and we write

\[ \lim_{x \to a+} f(x) = L, \]

if $x_n \to a+$ implies $f(x_n) \to L$. In this case, we also write
$f(a+) = L$. If $a = -\infty$, we write, instead,
$\lim_{x \to -\infty} f(x) = L$, $f(-\infty) = L$, i.e., we drop the plus.

Suppose $f(b-) = L$ and $(x_n)$ is a sequence approaching $b$ such that
$x_n < b$ for all but finitely many $n \ge 1$. Then we may modify finitely many
terms in $(x_n)$ so that $x_n < b$ for all $n \ge 1$; since modifying a finite
number of terms does not affect convergence, we have $f(x_n) \to L$. Similarly,
if $f(b+) = L$ and $(x_n)$ is a sequence approaching $b$ such that $x_n > b$ for
all but finitely many $n \ge 1$, we have $f(x_n) \to L$.

Of course, $L$ above is either a real or $\pm\infty$.


Theorem 2.2.1. Let $f$ be defined on an interval $(a,b)$ punctured at $c$,
$a < c < b$. Then $\lim_{x \to c} f(x)$ exists and equals $L$ iff $f(c+)$ and
$f(c-)$ both exist and equal $L$.


If $\lim_{x \to c} f(x) = L$, then $f(x_n) \to L$ for any sequence $x_n \to c$,
whether the sequence is to the right, the left, or neither. Hence, $f(c-) = L$
and $f(c+) = L$.

Conversely, suppose that $f(c-) = f(c+) = L$, and let $x_n \to c$ with
$x_n \ne c$ for all $n \ge 1$. We have to show that $f(x_n) \to L$.

Let $(y_n)$ denote the terms in $(x_n)$ that are greater than $c$, and let
$(z_n)$ denote the terms in $(x_n)$ that are less than $c$, arranged in their
given order. If $(y_n)$ is finite, then all but finitely many terms of $(x_n)$
are less than $c$; thus, $f(x_n) \to L$. If $(z_n)$ is finite, then all but
finitely many terms of $(x_n)$ are greater than $c$; thus, $f(x_n) \to L$.
Hence, we may assume both $(y_n)$ and $(z_n)$ are infinite sequences with
$y_n \to c+$ and $z_n \to c-$. Since $f(c+) = L$, it follows that
$f(y_n) \to L$; since $f(c-) = L$, it follows that $f(z_n) \to L$.

Let $f^*$ and $f_*$ denote the upper and lower limits of the sequence
$(f(x_n))$, and set $f_n^* = \sup\{f(x_k) : k \ge n\}$. Then
$f_n^* \searrow f^*$. Hence, for any subsequence $(f_{k_n}^*)$, we have
$f_{k_n}^* \searrow f^*$. The goal is to show that $f^* = L = f_*$.

Since $f(y_n) \to L$, its upper sequence converges to $L$,
$\sup_{i \ge n} f(y_i) \searrow L$; since $f(z_n) \to L$, its upper sequence
converges to $L$, $\sup_{i \ge n} f(z_i) \searrow L$.

For each $m \ge 1$, let $x_{k_m}$ denote the term in $(x_n)$ corresponding to
$y_m$, if the term $y_m$ appears after the term $z_m$ in $(x_n)$. Otherwise, if
$z_m$ appears after $y_m$, let $x_{k_m}$ denote the term in $(x_n)$
corresponding to $z_m$. In other words, if $y_n = x_{i_n}$ and $z_n = x_{j_n}$,
$x_{k_n} = x_{\max(i_n, j_n)}$. Thus for each $n \ge 1$, if $k \ge k_n$, we must
have $x_k$ equal to $y_i$ or $z_i$ with $i \ge n$, so

\[ \{x_k : k \ge k_n\} \subset \{y_i : i \ge n\} \cup \{z_i : i \ge n\}. \]

Hence,

\[ f_{k_n}^* = \sup_{k \ge k_n} f(x_k) \le \max\Bigl[ \sup_{i \ge n} f(y_i),\ \sup_{i \ge n} f(z_i) \Bigr], \qquad n \ge 1. \]

Now both sequences on the right are decreasing in $n \ge 1$ to $L$, and the
sequence on the left decreases to $f^*$ as $n \nearrow \infty$. Thus
$f^* \le L$. Now let $g = -f$. Since $g(c+) = g(c-) = -L$, by what we have just
learned, we conclude that the upper limit of $(g(x_n))$ is $\le -L$. But the
upper limit of $(g(x_n))$ equals minus the lower limit $f_*$ of $(f(x_n))$.
Hence, $f_* \ge L$, so $f^* = f_* = L$.

A limit point of $f$ at $c$ is a left limit point of $f$ at $c$ if it is a
limit point of $(f(x_n))$ for some sequence $x_n \to c-$. Similarly, if
$x_n \to c+$, we have right limit points. Every limit point at $c$ is a left
limit point at $c$ or a right limit point at $c$. Then $f(c+)$ exists iff all
right limit points of $f$ at $c$ are equal, and $f(c-)$ exists iff all left
limit points of $f$ at $c$ are equal. From the above result, the limit of $f$ at
$c$ exists iff all left and right limit points of $f$ at $c$ are equal.

Define $L^*_+$ to be the greatest of the right limit points of $f$ at $c$,
$L_{*+}$ the least of the right limit points of $f$ at $c$, $L^*_-$ the greatest
of the left limit points of $f$ at $c$, and $L_{*-}$ the least of the left limit
points of $f$ at $c$. These are the upper and lower left and right limits of $f$
at $c$. We conclude the limit of $f$ at $c$ exists iff the four quantities
$L^*_+$, $L_{*+}$, $L^*_-$, $L_{*-}$ are equal.

Since continuous limits are defined in terms of limits of sequences, they
enjoy the same arithmetic and ordering properties. For example,

\[ \lim_{x \to c} [f(x) + g(x)] = \lim_{x \to c} f(x) + \lim_{x \to c} g(x), \]
\[ \lim_{x \to c} [f(x) \cdot g(x)] = \lim_{x \to c} f(x) \cdot \lim_{x \to c} g(x). \]

These properties will be used without comment.

A function $f$ is increasing (decreasing) if $x \le x'$ implies
$f(x) \le f(x')$ ($f(x) \ge f(x')$, respectively), for all $x$, $x'$ in the
domain of $f$. The function $f$ is strictly increasing (strictly decreasing) if
$x < x'$ implies $f(x) < f(x')$ ($f(x) > f(x')$, respectively), for all $x$,
$x'$ in the domain of $f$. If $f$ is increasing or decreasing, we say $f$ is
monotone. If $f$ is strictly increasing or strictly decreasing, we say $f$ is
strictly monotone.

Exercises

2.2.1. Define $f : \mathbf{R} \to \mathbf{R}$ by setting $f(m/n) = 1/n$, for
$m/n \in \mathbf{Q}$ with no common factor in $m$ and $n > 0$, and $f(x) = 0$,
$x \notin \mathbf{Q}$. Show that $\lim_{x \to c} f(x) = 0$ for all
$c \in \mathbf{R}$.


2.2.2. Let $f$ be increasing on $(a,b)$. Then $f(a+)$ (exists and) equals
$\inf\{f(x) : a < x < b\}$, and $f(b-)$ equals $\sup\{f(x) : a < x < b\}$.


2.2.3. If $f$ is monotone on $(a,b)$, then $f(c+)$ and $f(c-)$ exist, and $f(c)$
is between $f(c-)$ and $f(c+)$, for all $c \in (a,b)$. Show also that, for each
$\delta > 0$, there are, at most, countably many points $c \in (a,b)$ where
$|f(c+) - f(c-)| \ge \delta$. Conclude that there are, at most, countably many
points $c$ in $(a,b)$ at which $f(c+) \ne f(c-)$.


2.2.4. Let $f$ be defined on $[a,b]$, and let $I_k = (c_k, d_k)$,
$1 \le k \le N$, be disjoint open intervals in $(a,b)$. The variation of $f$
over these intervals is

\[ \sum_{k=1}^{N} |f(d_k) - f(c_k)| \tag{2.2.1} \]

and the total variation $v_f(a,b)$ is the supremum of variations of $f$ in
$(a,b)$ over all such disjoint unions of open intervals in $(a,b)$. We say that
$f$ is bounded variation on $[a,b]$ if $v_f(a,b)$ is finite. Show bounded
variation on $[a,b]$ implies bounded on $[a,b]$.


2.2.5. If $f$ is increasing on an interval $[a,b]$, then $f$ is bounded
variation on $[a,b]$ and $v_f(a,b) = f(b) - f(a)$. If $f = g - h$ with $g$, $h$
increasing on $[a,b]$, then $f$ is bounded variation on $[a,b]$.


2.2.6. Let $f$ be bounded variation on $[a,b]$, and, for $a \le x \le b$, let
$v(x) = v_f(a,x)$. Show

\[ v(x) + |f(y) - f(x)| \le v(y), \qquad a \le x < y \le b, \]

hence, $v$ and $v - f$ are increasing on $[a,b]$. Conclude that $f$ is of
bounded variation on $[a,b]$ iff $f$ is the difference of two increasing
functions on $[a,b]$. If moreover $f$ is continuous, so are $v$ and $v - f$.


2.2.7. Show that the $f$ in Exercise 2.2.1 is not bounded variation on $[0,2]$
(remember that $\sum 1/n = \infty$).


2.2.8. Show that the upper limit and lower limit of $f$ at $c$ are the greatest
and least limit points of $f$ at $c$, respectively.


2.3 Continuous Functions


Let $f$ be defined on $(a,b)$, and choose $a < c < b$. We say that $f$ is
continuous at $c$ if

\[ \lim_{x \to c} f(x) = f(c). \]

If $f$ is continuous at every real $c$ in $(a,b)$, then we say that $f$ is
continuous on $(a,b)$ or, if $(a,b)$ is understood from the context, $f$ is
continuous.

Recalling the definition of $\lim_{x \to c}$, we see that $f$ is continuous at
$c$ iff, for all sequences $(x_n)$ satisfying $x_n \to c$ and $x_n \ne c$,
$n \ge 1$, $f(x_n) \to f(c)$. In fact, $f$ is continuous at $c$ iff $x_n \to c$
implies $f(x_n) \to f(c)$, i.e., the condition $x_n \ne c$, $n \ge 1$, is
superfluous. To see this, suppose that $f$ is continuous at $c$, and suppose
that $x_n \to c$, but $f(x_n) \not\to f(c)$. Since $f(x_n) \not\to f(c)$, by
Exercise 1.5.8, there is an $\epsilon > 0$ and a subsequence $(x'_n)$, such that
$|f(x'_n) - f(c)| \ge \epsilon$ and $x'_n \to c$, for $n \ge 1$. But then
$f(x'_n) \ne f(c)$ for all $n \ge 1$; hence, $x'_n \ne c$ for all $n \ge 1$.
Since $x'_n \to c$, by the continuity at $c$, we obtain $f(x'_n) \to f(c)$,
contradicting $|f(x'_n) - f(c)| \ge \epsilon$. Thus $f$ is continuous at $c$ iff
$x_n \to c$ implies $f(x_n) \to f(c)$.

In the previous section, we saw that $f(x) = x^2$ is continuous at $c$. Since
this works for any $c$, $f$ is continuous. Repeating this argument, one can show
that $f(x) = x^4$ is continuous, since $x^4 = x^2 x^2$. A simpler example is to
choose a real $k$ and to set $f(x) = k$ for all $x$. Here $f(x_n) = k$, and
$f(c) = k$ for all sequences $(x_n)$ and all $c$, so $f$ is continuous. Another
example is $f : (0, \infty) \to \mathbf{R}$ given by $f(x) = 1/x$. By the
division property of sequences, $x_n \to c$ implies $1/x_n \to 1/c$ for $c > 0$,
so $f$ is continuous.

Functions can be continuous at various points and not continuous at other
points. For example, the function $f$ in Exercise 2.2.1 is continuous at every
irrational $c$ and not continuous at every rational $c$. On the other hand, the
function $f : \mathbf{R} \to \mathbf{R}$, given by (§2.2)

\[ f(x) = \begin{cases} 1, & x \in \mathbf{Q}, \\ 0, & x \notin \mathbf{Q}, \end{cases} \]

is continuous at no point.


Continuous functions have very simple arithmetic and ordering properties.
If $f$ and $g$ are defined on $(a,b)$ and $k$ is real, we have functions
$f + g$, $kf$, $fg$, $\max(f,g)$, $\min(f,g)$ defined on $(a,b)$ by setting, for
$a < x < b$,

\[ (f + g)(x) = f(x) + g(x), \]
\[ (kf)(x) = k f(x), \]
\[ (fg)(x) = f(x) g(x), \]
\[ \max(f,g)(x) = \max[f(x), g(x)], \]
\[ \min(f,g)(x) = \min[f(x), g(x)]. \]

If $g$ is nonzero on $(a,b)$, i.e., $g(x) \ne 0$ for all $a < x < b$, define
$f/g$ by setting

\[ (f/g)(x) = \frac{f(x)}{g(x)}, \qquad a < x < b. \]


Theorem 2.3.1. If $f$ and $g$ are continuous, then so are $f + g$, $kf$, $fg$,
$\max(f,g)$, and $\min(f,g)$. Moreover, if $g$ is nonzero, then $f/g$ is
continuous.


This is an immediate consequence of the arithmetic and ordering properties
of sequences: If $a < c < b$ and $x_n \to c$, then $f(x_n) \to f(c)$ and
$g(x_n) \to g(c)$. Hence, $f(x_n) + g(x_n) \to f(c) + g(c)$,
$k f(x_n) \to k f(c)$, $f(x_n) g(x_n) \to f(c) g(c)$,
$\max[f(x_n), g(x_n)] \to \max[f(c), g(c)]$, and
$\min[f(x_n), g(x_n)] \to \min[f(c), g(c)]$. If $g(c) \ne 0$, then
$f(x_n)/g(x_n) \to f(c)/g(c)$.

For example, we see immediately that $f(x) = |x|$ is continuous on $\mathbf{R}$
since $|x| = \max(x, -x)$.

Let us prove, by induction, that, for all $k \ge 1$, the monomials
$f_k(x) = x^k$ are continuous (on $\mathbf{R}$). For $k = 1$, this is so since
$x_n \to c$ implies $f_1(x_n) = x_n \to c = f_1(c)$. Assuming that this is true
for $k$, $f_{k+1} = f_k f_1$ since $x^{k+1} = x^k x$. Hence, the result follows
from the arithmetic properties of continuous functions.

A polynomial $f : \mathbf{R} \to \mathbf{R}$ is a linear combination of
monomials, i.e., a polynomial has the form

\[ f(x) = a_0 x^d + a_1 x^{d-1} + a_2 x^{d-2} + \cdots + a_{d-1} x + a_d. \]

If $a_0 \ne 0$, we call $d$ the degree of $f$. The reals $a_0, a_1, \dots, a_d$
are the coefficients of the polynomial.

Let $f$ be a polynomial of degree $d > 0$, and let $a \in \mathbf{R}$. Then
there is a polynomial $g$ of degree $d - 1$ satisfying⁴

\[ \frac{f(x) - f(a)}{x - a} = g(x), \qquad x \ne a. \tag{2.3.1} \]

To see this, since every polynomial is a linear combination of monomials, it
is enough to check (2.3.1) on monomials. But, for $f(x) = x^n$,

\[ \frac{x^n - a^n}{x - a} = x^{n-1} + x^{n-2} a + \cdots + x a^{n-2} + a^{n-1}, \qquad x \ne a, \tag{2.3.2} \]

which can be checked by cross multiplying.⁵ This establishes (2.3.1).
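
The polynomial $g$ in (2.3.1) can be computed explicitly. The sketch below uses synthetic division (Horner's scheme), which is a standard technique swapped in here for illustration rather than the monomial-by-monomial argument of the text; the function names and the convention that coefficients are listed from $a_0$ (leading) to $a_d$ (constant) are assumptions of the example.

```python
def divide_out(coeffs, a):
    """Synthetic division of f by (x - a).

    coeffs = [a_0, ..., a_d] represents f(x) = a_0 x^d + ... + a_d.
    Returns (g_coeffs, f(a)) with f(x) - f(a) = (x - a) * g(x).
    """
    partial = []
    acc = 0
    for coef in coeffs:
        acc = acc * a + coef   # Horner step; the running values are g's coefficients
        partial.append(acc)
    return partial[:-1], partial[-1]

def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list, leading term first."""
    acc = 0
    for coef in coeffs:
        acc = acc * x + coef
    return acc

f = [1, -6, 11, -6]            # x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
g, value = divide_out(f, 1)
print(g, value)                # coefficients of x^2 - 5x + 6, and f(1)
```

When $a$ is a root, the returned value is zero and $f(x) = (x - a) g(x)$ with $g$ of degree one less, which is the step used in the root-counting induction.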

Since a monomial is continuous and a polynomial is a linear combination 
of monomials, by induction on the degree, we obtain the following. 


Theorem 2.3.2. Every polynomial $f$ is continuous on $\mathbf{R}$. Moreover, if
$d$ is its degree, there are, at most, $d$ real numbers $x$ satisfying
$f(x) = 0$.


4 $g$ also depends on $a$.
5 (2.3.2) with $x = 1$ was used to sum the geometric series in §1.6.




A real $x$ satisfying $f(x) = 0$ is called a zero or a root of $f$. Thus every
polynomial $f$ has, at most, $d$ roots. To see this, proceed by induction on the
degree of $f$. If $d = 1$, $f(x) = a_0 x + a_1$, so $f$ has one root
$x = -a_1/a_0$. Now suppose that every $d$th-degree polynomial has, at most, $d$
roots, and let $f$ be a polynomial of degree $d + 1$. We have to show that the
number of roots of $f$ is at most $d + 1$. If $f$ has no roots, we are done.
Otherwise, let $a$ be a root, $f(a) = 0$. Then by (2.3.1) there is a polynomial
$g$ of degree $d$ such that $f(x) = (x - a) g(x)$. Thus any root $b \ne a$ of
$f$ must satisfy $g(b) = 0$. Since by the inductive hypothesis $g$ has, at most,
$d$ roots, we see that $f$ has, at most, $d + 1$ roots.

A polynomial may have no roots, e.g., $f(x) = x^2 + 1$. However, every
polynomial of odd degree has at least one root (Exercise 2.3.1).

A rational function is a quotient $f = p/q$ of two polynomials. The natural
domain of $f$ is $\mathbf{R} \setminus Z(q)$, where $Z(q)$ denotes the set of
roots of $q$. Since $Z(q)$ is a finite set, the natural domain of $f$ is a
finite union of open intervals. We conclude that every rational function is
continuous where it is defined.

Let $f : (a,b) \to \mathbf{R}$. If $f$ is not continuous at $c \in (a,b)$, we say that $f$ is
discontinuous at c. There are “mild” discontinuities, and there are “wild” 
discontinuities. The mildest situation (Figure 2.3) is when the limits f(c+) 
and f(c—) exist and are equal, but not equal to f(c). This can be easily 
remedied by modifying the value of f(c) to equal f(c+) = f(c—). With this 
modification, the resulting function then is continuous at c. Because of this, 
such a point c is called a removable discontinuity. For example, the function 
f in Exercise 2.2.1 has removable discontinuities at every rational. 

The next level of complexity is when f(c+) and f(c—) exist but may 
or may not be equal. In this case, we say that f has a jump discontinuity 
(Figure 2.3) or a mild discontinuity at c. For example, every monotone func- 
tion has (at worst) jump discontinuities. In fact, every function of bounded 
variation has (at worst) jump discontinuities (Exercise 2.3.18). The (amount 
of) jump at c, a real number, is f(c+) — f(c—). In particular, a jump discon- 
tinuity of jump zero is nothing more than a removable discontinuity. 




Fig. 2.3 A jump of 1 at each integer 


Any discontinuity that is not a jump is called a wild discontinuity
(Figure 2.4). If $f$ has a wild discontinuity at $c$, then from above $f$ cannot
be of bounded variation on any open interval surrounding $c$. The converse of
this statement is false. It is possible for $f$ to have mild discontinuities but
not be of bounded variation (Exercise 2.2.7).

Fig. 2.4 A wild discontinuity


An alternate and useful description of continuity is in terms of a modulus
of continuity. Let $f : (a,b) \to \mathbf{R}$, and fix $a < c < b$. For $\delta > 0$, let

$$\mu_c(\delta) = \sup\{|f(x) - f(c)| : |x - c| < \delta,\ a < x < b\}.$$

Since the sup, here, is possibly that of an unbounded set, we may have
$\mu_c(\delta) = \infty$. The function $\mu_c : (0,\infty) \to [0,\infty) \cup \{\infty\}$ is the modulus of
continuity of $f$ at $c$ (Figure 2.5).

For example, let $f : (1,10) \to \mathbf{R}$ be given by $f(x) = x^2$ and pick $c = 9$.
Since $x^2$ is monotone over any interval not containing zero, the maximum
value of $|x^2 - 81|$ over any interval not containing zero is obtained by plugging
in the endpoints. Hence, $\mu_9(\delta)$ is obtained by plugging in $x = 9 \pm \delta$, leading
to $\mu_9(\delta) = \delta(\delta + 18)$. In fact, this is correct only if $0 < \delta \le 1$. If $1 \le \delta \le 8$, the
interval under consideration is $(9-\delta, 9+\delta) \cap (1,10) = (9-\delta, 10)$. Here plugging
in the endpoints leads to $\mu_9(\delta) = \max(19, 18\delta - \delta^2)$. If $\delta \ge 8$, then $(9-\delta, 9+\delta)$
contains $(1,10)$, and hence, $\mu_9(\delta) = 80$. Summarizing, for $f(x) = x^2$, $c = 9$,
and $(a,b) = (1,10)$,

$$\mu_c(\delta) = \begin{cases} \delta(\delta + 18), & 0 < \delta \le 1, \\ \max(19, 18\delta - \delta^2), & 1 \le \delta \le 8, \\ 80, & \delta \ge 8. \end{cases}$$
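As a sanity check on the summary above, the following Python sketch compares the closed form with a brute-force estimate of the sup on a grid. The names `mu_formula` and `mu_sampled` and the grid size are illustrative assumptions, not part of the text. Since the sup is taken over an open interval, the sampled value can only undershoot the closed form slightly.

```python
def mu_formula(delta):
    """Closed form for the modulus of continuity of x**2 at c = 9 on (1, 10)."""
    if delta <= 1:
        return delta * (delta + 18)
    if delta <= 8:
        return max(19, 18 * delta - delta ** 2)
    return 80


def mu_sampled(delta, c=9.0, a=1.0, b=10.0, samples=20_000):
    """Estimate sup{|x^2 - c^2| : |x - c| < delta, a < x < b} on a grid."""
    lo = max(a, c - delta)
    hi = min(b, c + delta)
    best = 0.0
    for k in range(1, samples):
        x = lo + (hi - lo) * k / samples  # interior grid points only
        best = max(best, abs(x * x - c * c))
    return best


for delta in (0.1, 0.5, 1.0, 3.0, 8.0, 20.0):
    print(delta, mu_formula(delta), round(mu_sampled(delta), 4))
```

The two columns agree up to the grid resolution in each of the three regimes of the piecewise formula.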


Going back to the general definition, note that $\mu_c(\delta)$ is an increasing function
of $\delta$, and hence, $\mu_c(0+)$ exists (Exercise 2.2.2).


Theorem 2.3.3. Let $f$ be defined on $(a,b)$, and choose $c \in (a,b)$. The
following are equivalent.

A. $f$ is continuous at $c$.
B. $\mu_c(0+) = 0$.
C. For all $\epsilon > 0$, there exists $\delta > 0$, such that

$$|x - c| < \delta \quad \text{implies} \quad |f(x) - f(c)| < \epsilon.$$

Fig. 2.5 Computing the modulus of continuity


That A implies B is left as Exercise 2.3.2. Now assume B, and suppose
that $\epsilon > 0$ is given. Since $\mu_c(0+) = 0$, there exists a $\delta > 0$ with $\mu_c(\delta) < \epsilon$.
Then by definition of $\mu_c$, $|x - c| < \delta$ implies $|f(x) - f(c)| \le \mu_c(\delta) < \epsilon$, which
establishes C. Now assume the $\epsilon$-$\delta$ criterion C, fix $\epsilon > 0$ with its corresponding
$\delta$, and let $x_n \to c$. Then for all but a finite number of terms, $|x_n - c| < \delta$. Hence, for all but a finite
number of terms, $f(c) - \epsilon < f(x_n) < f(c) + \epsilon$. Let $y_n = f(x_n)$, $n \ge 1$. By
the ordering properties of sup and inf, $f(c) - \epsilon \le y_{n*} \le y_n^* \le f(c) + \epsilon$ for all sufficiently large $n$. By the
ordering properties of sequences, $f(c) - \epsilon \le y_* \le y^* \le f(c) + \epsilon$. Since $\epsilon > 0$
is arbitrary, $y^* = y_* = f(c)$. Thus $y_n = f(x_n) \to f(c)$. Since $(x_n)$ was any
sequence converging to $c$, $\lim_{x \to c} f(x) = f(c)$, i.e., A.

Thus in practice, one needs to compute $\mu_c(\delta)$ only for $\delta$ small enough,
since it is the behavior of $\mu_c$ near zero that counts. For example, to check
continuity of $f(x) = x^2$ at $c = 9$, it is enough to note that $\mu_9(\delta) = \delta(\delta + 18)$
for small enough $\delta$, which clearly approaches zero as $\delta \to 0+$.

To check the continuity of $f(x) = x^2$ at $c = 9$ using the $\epsilon$-$\delta$ criterion C,
given $\epsilon > 0$, it is enough to exhibit a $\delta > 0$ with $\mu_9(\delta) < \epsilon$. Such a $\delta$ is the lesser
of $\epsilon/20$ and $1$, $\delta = \min(\epsilon/20, 1)$. To see this, first note that $\delta(\delta + 18) \le 19$ for
this $\delta$. Then $\epsilon \le 19$ implies $\delta(\delta + 18) \le (\epsilon/20)(1 + 18) = (19/20)\epsilon < \epsilon$, whereas
$\epsilon > 19$ implies $\delta(\delta + 18) \le 19 < \epsilon$. Hence, in either case, $\mu_9(\delta) < \epsilon$, establishing C.
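The choice $\delta = \min(\epsilon/20, 1)$ can be spot-checked numerically. The sketch below is only an illustration; the helper names `delta_for` and `mu9_small` are hypothetical, and `mu9_small` uses the closed form $\delta(\delta + 18)$, which is valid only for $0 < \delta \le 1$.

```python
def delta_for(eps):
    """The delta proposed in the text for f(x) = x**2 at c = 9."""
    return min(eps / 20, 1)


def mu9_small(delta):
    """Modulus of continuity of x**2 at c = 9, valid for 0 < delta <= 1."""
    return delta * (delta + 18)


for eps in (1e-6, 0.01, 1, 19, 20, 1000):
    d = delta_for(eps)
    # Each row should show mu9_small(d) strictly below eps.
    print(eps, d, mu9_small(d), mu9_small(d) < eps)
```

Both regimes of the argument appear: for $\epsilon \le 19$ the bound $(19/20)\epsilon$ is what matters, and for larger $\epsilon$ the cap $\delta = 1$ gives the value 19.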

Now we turn to the mapping properties of a continuous function. First, 
we define one-sided continuity. Let f be defined on (a, b]. We say that f is 
continuous at b from the left if f(b—) = f(b). In addition, if f is continuous 
on (a,b), we say that f is continuous on (a, b]. Let f be defined on [a, b). We 
say that f is continuous at a from the right if f(a+) = f(a). In addition, if 
f is continuous on (a,b), we say that f is continuous on [a, b). 

Note by Theorem 2.2.1 that a function f is continuous at a particular 
point c iff f is continuous at c from the right and continuous at c from the 
left. 

Let $f$ be defined on $[a,b]$. We say that $f$ is continuous on $[a,b]$ if $f$ is
continuous on $[a,b)$ and $(a,b]$. Checking the definitions, we see $f$ is continuous
on $A$ if, for every $c \in A$ and every sequence $(x_n) \subset A$ converging to $c$,
$f(x_n) \to f(c)$, whether $A$ is $(a,b)$, $(a,b]$, $[a,b)$, or $[a,b]$.


Theorem 2.3.4. Let $f$ be continuous on a compact interval $[a,b]$. Then
$f([a,b])$ is a compact interval $[m, M]$.

Thus a continuous function maps compact intervals to compact intervals.
Of course, it may not be the case that $f([a,b])$ equals $[f(a), f(b)]$. For example,
if $f(x) = x^2$, $f([-2,2]) = [0,4]$ and $[f(-2), f(2)] = \{4\}$. We derive two
consequences of this theorem.

Let $f([a,b]) = [m, M]$. Then we have two reals $c$ and $d$ in $[a,b]$, such that
$f(c) = m$ and $f(d) = M$. In other words, the sup is attained in

$$M = \sup\{f(x) : a \le x \le b\} = \max\{f(x) : a \le x \le b\}$$

and the inf is attained in

$$m = \inf\{f(x) : a \le x \le b\} = \min\{f(x) : a \le x \le b\}.$$

More succinctly, $M$ is a max and $m$ is a min for the set $f([a,b])$.


Theorem 2.3.5. Let $f$ be continuous on $[a,b]$. Then $f$ achieves its max and
its min over $[a,b]$.


Of course, this is not generally true on noncompact intervals since $f(x) = 1/x$
has no max on $(0,1]$.

A second consequence is: Suppose that $L$ is an intermediate value between
$f(a)$ and $f(b)$. Then there must be a $c$, $a < c < b$, satisfying $f(c) = L$. This
follows since $f(a)$ and $f(b)$ are two reals in $f([a,b])$ and $f([a,b])$ is an interval.
This is the intermediate value property.


Theorem 2.3.6 (Intermediate Value Property). Let f be continuous 
on [a,b] and suppose f(a) < L < f(b). Then there is c € (a,b) with f(c) = L. 


On the other hand, the two consequences, the existence of the max and the
min and the intermediate value property, combine to yield Theorem 2.3.4.
To see this, let $m = f(c)$ and $M = f(d)$ denote the min and the max, with
$c, d \in [a,b]$. If $m = M$, $f$ is constant; hence, $f([a,b]) = [m, M]$. If $m < M$ and
$m < L < M$, apply the intermediate value property to conclude that there
is an $x$ between $c$ and $d$ with $f(x) = L$. Hence, $f([a,b]) = [m, M]$. Thus to
derive the theorem, it is enough to derive the two consequences.

For the first, let $M = \sup f([a,b])$. By Theorem 1.5.4, there is a sequence
$(x_n)$ in $[a,b]$ such that $f(x_n) \to M$. By Theorem 2.1.1, $(x_n)$ subconverges
to some $c \in [a,b]$. By continuity, $(f(x_n))$ subconverges to $f(c)$. Since $(f(x_n))$
also converges to $M$, $M = f(c)$, so $f$ has a max. Proceed similarly for the
min. This establishes Theorem 2.3.5.

For the second, suppose that $f(a) < f(b)$, and let $L$ be an intermediate
value, $f(a) < L < f(b)$. We proceed as in the construction of $\sqrt{2}$ in §1.4.
Let $S = \{x \in [a,b] : f(x) < L\}$, and let $c = \sup S$. $S$ is nonempty since
$a \in S$, and $S$ is clearly bounded. By Theorem 1.5.4, select a sequence $(x_n)$ in
$S$ converging to $c$, $x_n \to c$. By continuity, it follows that $f(x_n) \to f(c)$. Since
$f(x_n) < L$ for all $n \ge 1$, we obtain $f(c) \le L$. On the other hand, $c < b$ since
$f(c) \le L < f(b)$, so for $n$ large $c + 1/n$ lies in $[a,b]$ but is
not in $S$; hence, $f(c + 1/n) \ge L$. Since $c + 1/n \to c$, we obtain $f(c) \ge L$. Thus
$f(c) = L$. The case $f(a) > f(b)$ is similar or is established by applying the
previous to $-f$. This establishes Theorem 2.3.6 and hence Theorem 2.3.4.
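The intermediate value property also suggests a practical way to locate such a $c$. The sketch below uses bisection, which is a different technique from the $\sup S$ argument in the proof, although it maintains the same invariant $f(\text{lo}) < L \le f(\text{hi})$. The function name `intermediate_value` and the tolerance are illustrative assumptions.

```python
def intermediate_value(f, a, b, L, tol=1e-12):
    """Approximate a point c in (a, b) with f(c) = L by bisection.

    Assumes f is continuous on [a, b] and f(a) < L < f(b).
    """
    if not f(a) < L < f(b):
        raise ValueError("need f(a) < L < f(b)")
    lo, hi = a, b  # invariant: f(lo) < L <= f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < L:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Example: the value L = 2 of f(x) = x**2 on [0, 2] is attained near sqrt(2).
c = intermediate_value(lambda x: x * x, 0.0, 2.0, 2.0)
print(c)
```

Each step halves an interval on which the value $L$ is still straddled, so the midpoints converge to a point where $f$ equals $L$.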


From this theorem, it follows that a continuous function maps open intervals
to intervals. However, they need not be open. For example, with
$f(x) = x^2$, $f((-2,2)) = [0,4)$. However, a function that is continuous and
strictly monotone maps open intervals to open intervals (Exercise 2.3.3).

The above theorem is the result of compactness mixed with continuity.
This mixture yields other dividends. Let $f : (a,b) \to \mathbf{R}$ be given, and fix a
subset $A \subset (a,b)$. For $\delta > 0$, set

$$\mu_A(\delta) = \sup\{\mu_c(\delta) : c \in A\}.$$

This is the uniform modulus of continuity of $f$ on $A$. Since $\mu_c(\delta)$ is an increasing
function of $\delta$ for each $c \in A$, it follows that $\mu_A(\delta)$ is an increasing
function of $\delta$, and hence $\mu_A(0+)$ exists. We say $f$ is uniformly continuous
on $A$ if $\mu_A(0+) = 0$. When $A = (a,b)$ equals the whole domain of the function,
we delete the subscript $A$ and write $\mu(\delta)$ for the uniform modulus of
continuity of $f$ on its domain.

Whereas continuity is a property pertaining to the behavior of a function
at (or near) a given point $c$, uniform continuity is a property pertaining to
the behavior of $f$ near a given set $A$. Moreover, since $\mu_c(\delta) \le \mu_A(\delta)$, uniform
continuity on $A$ implies continuity at every point $c \in A$.

Inserting the definition of $\mu_c(\delta)$ in $\mu_A(\delta)$ yields

$$\mu_A(\delta) = \sup\{|f(x) - f(c)| : |x - c| < \delta,\ a < x < b,\ c \in A\},$$

where, now, the sup is over both $x$ and $c$.

For example, for $f(x) = x^2$, the uniform modulus $\mu_A(\delta)$ over $A = (1,10)$
equals the sup of $|x^2 - y^2|$ over all $1 < x < y < 10$ with $y - x < \delta$. But this is
largest when $y = x + \delta$; hence, $\mu_A(\delta)$ is the sup of $\delta^2 + 2x\delta$ over $1 < x < 10 - \delta$,
which yields $\mu_A(\delta) = 20\delta - \delta^2$. In fact, this is correct only if $0 < \delta \le 9$. For
$\delta = 9$, the sup is already over all of $(1,10)$ and hence cannot get any bigger.
Hence, $\mu_A(\delta) = 99$ for $\delta \ge 9$. Summarizing, for $f(x) = x^2$ and $A = (1,10)$,

$$\mu_A(\delta) = \begin{cases} 20\delta - \delta^2, & 0 < \delta \le 9, \\ 99, & \delta \ge 9. \end{cases}$$
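The uniform modulus can be estimated by brute force in the same spirit as the pointwise one. The following is a rough sketch only: `mu_A_formula` and `mu_A_sampled` are hypothetical helper names, and the double loop over a coarse grid underestimates the sup by an amount comparable to the grid spacing times the slope of $x^2$.

```python
def mu_A_formula(delta):
    """Closed form for the uniform modulus of x**2 over A = (1, 10)."""
    return 20 * delta - delta ** 2 if delta <= 9 else 99


def mu_A_sampled(delta, a=1.0, b=10.0, n=600):
    """Estimate sup{|x^2 - y^2| : |x - y| < delta, a < x, y < b} on a grid."""
    pts = [a + (b - a) * k / n for k in range(1, n)]
    best = 0.0
    for x in pts:
        for y in pts:
            if abs(x - y) < delta:
                best = max(best, abs(x * x - y * y))
    return best


for delta in (0.5, 2.0, 9.0, 12.0):
    print(delta, mu_A_formula(delta), round(mu_A_sampled(delta), 2))
```

The sampled values sit just below the closed form, and they stop growing once $\delta$ exceeds the length of the interval.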


Since $f$ is uniformly continuous on $A$ if $\mu_A(0+) = 0$, in practice one
needs to compute $\mu_A(\delta)$ only for $\delta$ small enough. For example, to check
uniform continuity of $f(x) = x^2$ over $A = (1,10)$, it is enough to note that
$\mu_A(\delta) = 20\delta - \delta^2$ for small enough $\delta$, which clearly approaches zero as $\delta \to 0+$.

Now let $f : (a,b) \to \mathbf{R}$ be continuous, and fix $A \subset (a,b)$. What additional
conditions on $f$ are needed to guarantee uniform continuity on $A$? When $A$
is a finite set $\{c_1, \dots, c_N\}$,

$$\mu_A(\delta) = \max\left[\mu_{c_1}(\delta), \dots, \mu_{c_N}(\delta)\right],$$

and hence $f$ is necessarily uniformly continuous on $A$.

When $A$ is an infinite set, this need not be so. For example, with $f(x) = x^2$
and $B = (0,\infty)$, $\mu_B(\delta)$ equals the sup of $\mu_c(\delta) = 2c\delta + \delta^2$ over $0 < c < \infty$,
or $\mu_B(\delta) = \infty$, for each $\delta > 0$. Hence, $f$ is not uniformly continuous on $B$.

It turns out that continuity on a compact interval is sufficient for uniform
continuity.


Theorem 2.3.7. If $f$ is continuous on $[a,b]$, then $f$ is uniformly continuous
on $(a,b)$. Conversely, if $f$ is uniformly continuous on $(a,b)$, then $f$ extends
to a continuous function on $[a,b]$.


To see this, suppose that $\mu(0+) = \mu_{(a,b)}(0+) > 0$, and set $\epsilon = \mu(0+)/2$.
Since $\mu$ is increasing, $\mu(1/n) \ge 2\epsilon$, $n \ge 1$. Hence, for each $n \ge 1$, by the
definition of the sup in the definition of $\mu(1/n)$, there is a $c_n \in (a,b)$ with
$\mu_{c_n}(1/n) > \epsilon$. Now by the definition of the sup in $\mu_{c_n}(1/n)$, for each $n \ge 1$,
there is an $x_n \in (a,b)$ with $|x_n - c_n| < 1/n$ and $|f(x_n) - f(c_n)| > \epsilon$. By
compactness, $(x_n)$ subconverges to some $x \in [a,b]$. Since $|x_n - c_n| < 1/n$ for all
$n \ge 1$, $(c_n)$ subconverges to the same $x$. Hence, by continuity, $(|f(x_n) - f(c_n)|)$
subconverges to $|f(x) - f(x)| = 0$, which contradicts the fact that this last
sequence is bounded below by $\epsilon > 0$.

Conversely, let $f : (a,b) \to \mathbf{R}$ be uniformly continuous with modulus of
continuity $\mu$, and suppose $x_n \to a+$. Then $(x_n)$ is Cauchy, so let $(e_n)$ be an
error sequence for $(x_n)$. Since

$$\sup_{k,m \ge n} |f(x_k) - f(x_m)| \le \mu(|x_k - x_m|) \le \mu(e_n) \to 0, \qquad n \to \infty,$$

it follows $(f(x_n))$ is Cauchy and hence converges. If $x_n' \to a+$,

$$|f(x_n) - f(x_n')| \le \mu(|x_n - x_n'|) \to 0, \qquad n \to \infty,$$

hence, $f(a+)$ exists. Similarly, $f(b-)$ exists.

The conclusion may be false if $f$ is continuous on $(a,b)$ but not on $[a,b]$ (see
Exercise 2.3.23). One way to understand the difference between continuity
and uniform continuity is as follows.

Let $f$ be a continuous function defined on an interval $(a,b)$, and pick
$c \in (a,b)$. Then by definition of $\mu_c$, $|f(x) - f(c)| \le \mu_c(\delta)$ whenever $x$ lies in
the interval $(c - \delta, c + \delta)$. Setting $g(x) = f(c)$ for $x \in (c - \delta, c + \delta)$, we see
that, for any error tolerance $\epsilon$, by choosing $\delta$ satisfying $\mu_c(\delta) < \epsilon$, we obtain
a constant function $g$ approximating $f$ to within $\epsilon$, at least in the interval
$(c - \delta, c + \delta)$. Of course, in general, we do not expect to approximate $f$ closely
by one and the same constant function over the whole interval $(a,b)$. Instead,
we use piecewise constant functions.

If $(a,b)$ is an open interval, a partition of $(a,b)$ is a choice of points
$a = x_0 < x_1 < \dots < x_{n-1} < x_n = b$ in $(a,b)$, where we denote the endpoints $a$
and $b$ by $x_0$ and $x_n$, respectively (even when they are infinite). We use the
same notation for compact intervals, i.e., a partition of $[a,b]$ is a partition of
$(a,b)$ (Figure 2.6).

Fig. 2.6 A partition of $(a,b)$


We say $g : (a,b) \to \mathbf{R}$ is piecewise constant if there is a partition
$a = x_0 < x_1 < \dots < x_n = b$, such that $g$ restricted to $(x_{i-1}, x_i)$ is constant for
$i = 1, \dots, n$ (in this definition, the values of $g$ at the points $x_i$ are not restricted
in any way). The mesh $\delta$ of the partition $a = x_0 < x_1 < \dots < x_n = b$, by
definition, is the largest length of the subintervals, $\delta = \max_{1 \le i \le n} |x_i - x_{i-1}|$.
Note that an interval has partitions of arbitrarily small mesh iff the interval
is bounded.

Let $f : [a,b] \to \mathbf{R}$ be continuous. Then from above, $f$ is uniformly continuous
on $(a,b)$. Given a partition $a = x_0 < x_1 < \dots < x_n = b$ with
mesh $\delta$, choose $x_i^\#$ in $(x_{i-1}, x_i)$ arbitrarily, $i = 1, \dots, n$. Then by definition
of $\mu$, $|f(x) - f(x_i^\#)| \le \mu(\delta)$ for $x \in (x_{i-1}, x_i)$. If we set $g(x) = f(x_i^\#)$ for
$x \in (x_{i-1}, x_i)$, $i = 1, \dots, n$, and $g(x_i) = f(x_i)$, $i = 0, 1, \dots, n$, we obtain a
piecewise constant function $g : [a,b] \to \mathbf{R}$ satisfying $|f(x) - g(x)| \le \mu(\delta)$ for
every $x \in [a,b]$. Since $f$ is uniformly continuous, $\mu(0+) = 0$. Hence, for any
error tolerance $\epsilon > 0$, we can find a mesh $\delta$, such that $\mu(\delta) < \epsilon$. We have
derived the following (Figure 2.7).


Theorem 2.3.8. If $f$ is continuous on $[a,b]$, then for each $\epsilon > 0$, there is a
piecewise constant function $f_\epsilon$ on $[a,b]$ such that

$$|f(x) - f_\epsilon(x)| \le \epsilon, \qquad a \le x \le b.$$
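The construction behind this theorem can be sketched in a few lines of Python. The assumptions here are mine, not the text's: the partition is uniform, the sample point in each subinterval is its midpoint (the text allows any point), and the example uses $f(x) = x^2$ on $[1,10]$, for which the uniform modulus $20\delta - \delta^2$ was computed earlier. The name `piecewise_constant` is hypothetical.

```python
def piecewise_constant(f, a, b, n):
    """Piecewise constant g on [a, b] for the uniform partition into n pieces.

    g equals f at the midpoint of each open subinterval and equals f itself
    at the partition points (detected only up to floating-point rounding).
    """
    h = (b - a) / n

    def g(x):
        i = min(int((x - a) / h), n - 1)  # index of the subinterval containing x
        left = a + i * h
        if x == left or x == b:
            return f(x)
        return f(left + h / 2)

    return g


f = lambda x: x * x
a, b, eps = 1.0, 10.0, 0.1

# Pick the first n whose mesh delta = (b - a)/n satisfies mu(delta) < eps,
# using mu(delta) = 20*delta - delta**2 for x**2 on (1, 10).
n = 1
while 20 * ((b - a) / n) - ((b - a) / n) ** 2 >= eps:
    n += 1

g = piecewise_constant(f, a, b, n)
worst = max(abs(f(x) - g(x)) for x in (a + (b - a) * k / 10_000 for k in range(10_001)))
print(n, worst)
```

The observed worst-case error stays below the tolerance, as the bound $|f(x) - g(x)| \le \mu(\delta) < \epsilon$ predicts.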


Fig. 2.7 Piecewise constant approximation

If $f$ is continuous on an open interval, this result may be false. For example,
$f(x) = 1/x$, $0 < x < 1$, cannot be approximated as above by a piecewise
constant function (unless infinitely many subintervals are used), precisely
because $f$ “shoots up to $\infty$” near $0$.

Let us turn to the continuity of compositions (§1.1). Suppose that
$f : (a,b) \to \mathbf{R}$ and $g : (c,d) \to \mathbf{R}$ are given with the range of $f$ lying in the
domain of $g$, $f[(a,b)] \subset (c,d)$. Then the composition $g \circ f : (a,b) \to \mathbf{R}$ is
given by $(g \circ f)(x) = g[f(x)]$, $a < x < b$.


Theorem 2.3.9. If $f$ and $g$ are continuous, so is $g \circ f$.

Since $f$ is continuous, $x_n \to c$ implies $f(x_n) \to f(c)$. Since $g$ is continuous,

$$(g \circ f)(x_n) = g[f(x_n)] \to g[f(c)] = (g \circ f)(c).$$

This result can be written as

$$\lim_{x \to c} g[f(x)] = g\left[\lim_{x \to c} f(x)\right].$$

Since $g(x) = |x|$ is continuous, this implies

$$\lim_{x \to c} |f(x)| = \left|\lim_{x \to c} f(x)\right|.$$

The final issue is the invertibility of continuous functions. Let $f : [a,b] \to
[m, M]$ be a continuous function. When is there an inverse (§1.1) $g : [m, M] \to
[a,b]$? If it exists, is the inverse $g$ necessarily continuous? It turns out that the
answers to these questions are related to the monotonicity properties (§2.2)
of the continuous function. For example, if $f$ is continuous and increasing on
$[a,b]$ and $A \subset [a,b]$, $\sup f(A) = f(\sup A)$, and $\inf f(A) = f(\inf A)$ (Exercise 2.3.4).
It follows that the upper and lower limits of $(f(x_n))$ are $f(x^*)$
and $f(x_*)$, respectively, where $x^*$, $x_*$ are the upper and lower limits of $(x_n)$
(Exercise 2.3.5).


Theorem 2.3.10 (Inverse Function Theorem). Let $f$ be continuous
on $[a,b]$. Then $f$ is injective iff $f$ is strictly monotone. In this case, let
$[m, M] = f([a,b])$. Then the inverse $g : [m, M] \to [a,b]$ is continuous and
strictly monotone.


If $f$ is strictly monotone and $x \ne x'$, then $x < x'$ or $x > x'$ which implies
$f(x) < f(x')$ or $f(x) > f(x')$; hence, $f$ is injective.

Conversely, suppose that $f$ is injective and $f(a) < f(b)$. We claim that
$f$ is strictly increasing (Figure 2.8). To see this, suppose not and choose
$a \le x < x' \le b$ with $f(x) > f(x')$. There are two possibilities: Either
$f(a) < f(x)$ or $f(a) \ge f(x)$. In the first case, we can choose $L$ in

$$(f(a), f(x)) \cap (f(x'), f(x)).$$

By the intermediate value property, there are $c, d$ with $a < c < x < d < x'$
with $f(c) = L = f(d)$. Since $f$ is injective, this cannot happen, ruling out
the first case. In the second case, we must have $f(x') < f(b)$; hence, $x' < b$,
so we choose $L$ in

$$(f(x'), f(x)) \cap (f(x'), f(b)).$$

By the intermediate value property, there are $c, d$ with $x < c < x' < d < b$
with $f(c) = L = f(d)$. Since $f$ is injective, this cannot happen, ruling out
the second case. Thus $f$ is strictly increasing. If $f(a) > f(b)$, applying what
we just learned to $-f$ yields $-f$ strictly increasing or $f$ strictly decreasing.
Thus in either case, $f$ is strictly monotone.


Fig. 2.8 Derivation of the IFT when $f(a) < f(b)$


Clearly strict monotonicity of $f$ implies that of $g$. Now assume that $f$ is
strictly increasing, the case with $f$ strictly decreasing being entirely similar.
We have to show that $g$ is continuous. Suppose that $(y_n) \subset [m, M]$ with
$y_n \to y$. Let $x = g(y)$, let $x_n = g(y_n)$, $n \ge 1$, and let $x^*$ and $x_*$ denote the
upper and lower limits of $(x_n)$. We have to show $g(y_n) = x_n \to x = g(y)$.
Since $f$ is continuous and increasing, $f(x^*)$ and $f(x_*)$ are the upper and lower
limits of $y_n = f(x_n)$ (Exercise 2.3.5). Hence, $f(x^*) = y = f(x_*)$. Hence, by
injectivity, $x^* = x = x_*$.

As an application, note that $f(x) = x^2$ is strictly increasing on $[0,n]$ and
hence has an inverse $g_n(x) = \sqrt{x}$ on $[0,n^2]$, for each $n \ge 1$. By uniqueness
of inverses (Exercise 1.1.4), the functions $g_n$, $n \ge 1$, agree wherever their
domains overlap, hence yielding a single, continuous, strictly monotone
$g : [0,\infty) \to [0,\infty)$ satisfying $g(x) = \sqrt{x}$, $x \ge 0$. Similarly, for each $n \ge 1$,
$f(x) = x^n$ is strictly increasing on $[0,\infty)$. Thus every positive real $x$ has
a unique positive $n$th root $x^{1/n}$, and, moreover, the function $g(x) = x^{1/n}$ is
continuous on $[0,\infty)$. By composition, it follows that $f(x) = x^{m/n} = (x^m)^{1/n}$
is continuous and strictly monotone on $(0,\infty)$ for all naturals $m, n$. Since
$x^{-a} = 1/x^a$ for $a \in \mathbf{Q}$, we see that the power functions $f(x) = x^r$ are defined
and continuous on $(0,\infty)$ for all rationals $r$, strictly increasing for $r > 0$ and
strictly decreasing for $r < 0$. Moreover,
$x^{r+s} = x^r x^s$, $(x^r)^s = x^{rs}$ for $r, s$ rational, and, for $r > 0$ rational, $x^r \to 0$ as
$x \to 0$ and $x^r \to \infty$ as $x \to \infty$. The following limit is important: For $x > 0$,

$$\lim_{n \nearrow \infty} x^{1/n} = 1. \qquad (2.3.3)$$

To derive this, assume $x > 1$. Then $x < x\,x^{1/n} = x^{(n+1)/n}$, so $x^{1/(n+1)} < x^{1/n}$,
so the sequence $(x^{1/n})$ is decreasing and bounded below by $1$; hence, its limit
$L \ge 1$ exists. Since $L \le x^{1/2n}$, $L^2 \le x^{2/2n} = x^{1/n}$; hence, $L^2 \le L$ or $L \le 1$.
We conclude that $L = 1$. If $0 < x < 1$, then $1/x > 1$, so $x^{1/n} = 1/(1/x)^{1/n} \to 1$
as $n \nearrow \infty$.
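Both the existence of $n$th roots and the limit (2.3.3) are easy to observe numerically. The sketch below inverts the strictly increasing map $t \mapsto t^n$ by bisection; this is an illustrative technique chosen here, not the book's derivation, and the name `nth_root` and its tolerance are assumptions.

```python
def nth_root(x, n, tol=1e-13):
    """Positive n-th root of x >= 0, by bisection on the increasing map t -> t**n."""
    lo, hi = 0.0, max(1.0, x)  # hi**n >= x, so the root lies in [lo, hi]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


roots = [nth_root(5.0, n) for n in range(1, 40)]
for n in (1, 2, 3, 10, 39):
    print(n, roots[n - 1])
```

For $x = 5$ the printed values decrease and stay above 1, consistent with the monotone convergence used in the derivation.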

Any function that can be obtained from polynomials or rational functions 
by arithmetic operations and/or the taking of roots is called a (constructible) 
algebraic function. For example, 

1 


at cr 0<a<il, 


is an algebraic function. 
We now know what $a^b$ means for any $a > 0$ and $b \in \mathbf{Q}$. But what if $b \notin \mathbf{Q}$?
What does $2^{\sqrt{2}}$ mean? To answer this, fix $a > 1$ and $b > 0$, and let

$$c = \sup\{a^r : 0 < r < b,\ r \in \mathbf{Q}\}.$$

Let us check that when $b$ is rational, $c = a^b$. Since $r < s$ implies $a^r < a^s$,
$a^r \le a^b$ when $r < b$. Hence, $c \le a^b$. Similarly, $c \ge a^{b - 1/n} = a^b / a^{1/n}$ for all
$n \ge 1$ large enough that $b - 1/n > 0$. Let $n \nearrow \infty$ and use (2.3.3) to get $c \ge a^b$. Hence, $c = a^b$ when $b$ is
rational. Thus it is consistent to define, for any $a > 1$ and real $b > 0$,

$$a^b = \sup\{a^r : 0 < r < b,\ r \in \mathbf{Q}\},$$

$a^0 = 1$, and $a^{-b} = 1/a^b$. For all $b$ real, we define $1^b = 1$, whereas for $0 < a < 1$,
we define $a^b = 1/(1/a)^b$. This defines $a^b > 0$ for all positive real $a$ and all
real $b$. Moreover (Exercise 2.3.7),

$$a^b = \inf\{a^s : s > b,\ s \in \mathbf{Q}\}.$$
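The sup and inf descriptions of $a^b$ can be illustrated by squeezing $2^{\sqrt{2}}$ between rational powers. In this sketch the rationals are $r = \lfloor bq \rfloor / q$ and $s = r + 1/q$, the helper names are hypothetical, and the rational powers themselves are evaluated in floating point, so this is a numerical illustration rather than a construction from the axioms.

```python
from fractions import Fraction
import math


def rational_power(a, r):
    """a**r for a > 1 and a nonnegative rational r = p/q, computed as (a**(1/q))**p."""
    root = a ** (1.0 / r.denominator)  # q-th root of a, in floating point
    return root ** r.numerator         # integer power of that root


def bracket_power(a, b, q):
    """Return (a**r, a**s) for the rationals r = floor(b*q)/q <= b < s = r + 1/q."""
    r = Fraction(math.floor(b * q), q)
    s = r + Fraction(1, q)
    return rational_power(a, r), rational_power(a, s)


b = math.sqrt(2)
for q in (10, 100, 1000, 10_000, 100_000):
    lower, upper = bracket_power(2.0, b, q)
    print(q, lower, upper)
```

As the denominator $q$ grows, the lower values increase and the upper values decrease toward a common limit, which is the number the text calls $2^{\sqrt{2}}$.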


Theorem 2.3.11. $a^b$ satisfies the usual rules:

A. For $a > 1$ and $0 < b < c$ real, $1 < a^b < a^c$.
B. For $0 < a < 1$ and $0 < b < c$ real, $a^b > a^c$.
C. For $0 < a < b$ and $c > 0$ real, $a^c b^c = (ab)^c$, $(b/a)^c = b^c/a^c$, and $a^c < b^c$.
D. For $a > 0$ and $b, c$ real, $a^{b+c} = a^b a^c$.
E. For $a > 0$, $b, c$ real, $a^{bc} = (a^b)^c$.


Since $A \subset B$ implies $\sup A \le \sup B$, $a^b \le a^c$ when $a > 1$ and $b < c$. Since,
for any $b < c$, there is an $r \in \mathbf{Q} \cap (b,c)$, $a^b < a^c$, thus the first assertion.
Since, for $0 < a < 1$, $a^b = 1/(1/a)^b$, applying the first assertion to $1/a$ yields
$(1/a)^b < (1/a)^c$ or $a^b > a^c$, yielding the second assertion. For the third,
assume $a > 1$. If $0 < r < c$ is in $\mathbf{Q}$, then $a^r < a^c$ and $b^r < b^c$ yields

$$(ab)^r = a^r b^r < a^c b^c.$$

Taking the sup over $r < c$ yields $(ab)^c \le a^c b^c$. If $r < c$ and $s < c$ are positive
rationals, let $t$ denote their max. Then

$$a^r b^s \le a^t b^t = (ab)^t < (ab)^c.$$

Taking the sup of this last inequality over all $0 < r < c$, first, then over all
$0 < s < c$ yields $a^c b^c \le (ab)^c$. Hence, $(ab)^c = a^c b^c$ for $b > a > 1$. Using this,
we obtain $(b/a)^c a^c = b^c$ or $(b/a)^c = b^c/a^c$. Since $b/a > 1$ implies $(b/a)^c > 1$,
we also obtain $a^c < b^c$. The cases $a < b < 1$ and $a < 1 < b$ follow from the
case $b > a > 1$. This establishes the third. For the fourth, the case $0 < a < 1$
follows from the case $a > 1$, so assume $a > 1$, $b > 0$, and $c > 0$. If $r < b$ and
$s < c$ are positive rationals, then

$$a^{b+c} \ge a^{r+s} = a^r a^s.$$

Taking the sups over $r$ and $s$ yields $a^{b+c} \ge a^b a^c$. If $r < b + c$ is rational, let
$d = (b + c - r)/3 > 0$. Pick rationals $t$ and $s$ with $b > t > b - d$, $c > s > c - d$.
Then $t + s > b + c - 2d > r$, so

$$a^r < a^{t+s} = a^t a^s \le a^b a^c.$$

Taking the sup over all such $r$, we obtain $a^{b+c} \le a^b a^c$. This establishes the
fourth when $b$ and $c$ are positive. The cases $b \le 0$ or $c \le 0$ follow from
the positive case. The fifth involves approximating $b$ and $c$ by rationals, and
we leave it to the reader.

As an application, we define the power function with an irrational exponent.
This is a nonalgebraic or transcendental function. Some of the transcendental
functions in this book are the power function $x^a$ (when $a$ is irrational),
the exponential function $a^x$, the logarithm $\log_a x$, the trigonometric functions
and their inverses, and the gamma function. The trigonometric functions are
discussed in §3.5, the gamma function in §5.1, whereas the power, exponential,
and logarithm functions are discussed below.


Theorem 2.3.12. Let $a$ be real, and let $f(x) = x^a$ on $(0,\infty)$. For $a > 0$, $f$ is
strictly increasing and continuous with $f(0+) = 0$ and $f(\infty) = \infty$. For $a < 0$,
$f$ is strictly decreasing and continuous with $f(0+) = \infty$ and $f(\infty) = 0$.


Since $x^{-a} = 1/x^a$, the second part follows from the first, so assume $a > 0$.
Let $r, s$ be positive rationals with $r < a < s$, and let $x_n \to c$. We have to show
that $x_n^a \to c^a$. But the sequence $(x_n^a)$ lies between $(x_n^r)$ and $(x_n^s)$. Since we
already know that the rational power function is continuous, we conclude that
the upper and lower limits $L^*$, $L_*$ of $(x_n^a)$ satisfy $c^r \le L_* \le L^* \le c^s$. Taking
the sup over all $r$ rational and the inf over all $s$ rational, with $r < a < s$,
gives $L^* = L_* = c^a$. Thus $f$ is continuous. Also since $x^r \to \infty$ as $x \to \infty$
and $x^r \le x^a$ for $r < a$ and $x > 1$, $f(\infty) = \infty$. Since $x^a \le x^r$ for $r < a$ and
$0 < x < 1$, and $x^r \to 0$ as $x \to 0+$, $f(0+) = 0$.

Now we vary $b$ and fix $a$ in $a^b$.

Theorem 2.3.13. Fix $a > 1$. Then the function $f(x) = a^x$, $x \in \mathbf{R}$, is strictly
increasing and continuous. Moreover,

$$f(x + x') = f(x) f(x'), \qquad (2.3.4)$$

$f(-\infty) = 0$, $f(0) = 1$, and $f(\infty) = \infty$.


From Theorem 2.3.11, we know that $f$ is strictly increasing. Since $a^n \nearrow \infty$
as $n \nearrow \infty$, $f(\infty) = \infty$. Since $f(-x) = 1/f(x)$, $f(-\infty) = 0$. Continuity
remains to be shown. If $x_n \searrow c$, then $(a^{x_n})$ is decreasing and $a^{x_n} \ge a^c$, so its
limit $L$ is $\ge a^c$. On the other hand, for $d > 0$, the sequence is eventually below
$a^{c+d} = a^c a^d$; hence, $L \le a^c a^d$. Choosing $d = 1/n$, we obtain $a^c \le L \le a^c a^{1/n}$.
Let $n \nearrow \infty$ to get $L = a^c$. Thus, $a^{x_n} \searrow a^c$. If $x_n \to c+$ is not necessarily
decreasing, then $x_n^* \searrow c$; hence, $a^{x_n^*} \to a^c$. But $x_n^* \ge x_n$ for all $n \ge 1$; hence,
$a^{x_n^*} \ge a^{x_n} \ge a^c$, so $a^{x_n} \to a^c$. Proceed similarly from the left.

The function $f(x) = a^x$ is the exponential function with base $a > 1$. In
fact, the exponential is the unique continuous function $f$ on $\mathbf{R}$ satisfying the
functional equation (2.3.4) and $f(1) = a$.

By the inverse function theorem, $f$ has an inverse $g$ on any compact interval
and hence on $\mathbf{R}$. We call $g$ the logarithm with base $a > 1$ and write $g(x) =
\log_a x$. By definition of inverse, $a^{\log_a x} = x$, for $x > 0$, and $\log_a(a^x) = x$, for
$x \in \mathbf{R}$. The following is an immediate consequence of the above.


Theorem 2.3.14. The inverse of the exponential $f(x) = a^x$ with base $a > 1$
is the logarithm with base $a > 1$, $g(x) = \log_a x$. The logarithm is continuous
and strictly increasing on $(0,\infty)$. The domain of $\log_a$ is $(0,\infty)$, the range is
$\mathbf{R}$, $\log_a(0+) = -\infty$, $\log_a 1 = 0$, $\log_a \infty = \infty$, and

$$\log_a(bc) = \log_a b + \log_a c, \qquad \log_a(b^c) = c \log_a b,$$

for $b > 0$, $c > 0$.


Exercises 


2.3.1. If $f$ is a polynomial of odd degree, then $f(\pm\infty) = \pm\infty$ or $f(\pm\infty) =
\mp\infty$, and there is at least one real $c$ with $f(c) = 0$.


2.3.2. If $f$ is continuous at $c$, then$^6$ $\mu_c(0+) = 0$.


2.3.3. If $f : (a,b) \to \mathbf{R}$ is continuous, then $f((a,b))$ is an interval. In addition,
if $f$ is strictly monotone, $f((a,b))$ is an open interval.


6 This uses the axiom of countable choice.

2.3.4. If $f$ is continuous and increasing on $[a,b]$ and $A \subset [a,b]$, then
$\sup f(A) = f(\sup A)$ and $\inf f(A) = f(\inf A)$.


2.3.5. With $f$ as in Exercise 2.3.4, let $x^*$ and $x_*$ be the upper and lower
limits of a sequence $(x_n)$. Then $f(x^*)$ and $f(x_*)$ are the upper and lower
limits of $(f(x_n))$.


2.3.6. With $r, s \in \mathbf{Q}$ and $x > 0$, show that $(x^r)^s = x^{rs}$ and $x^{r+s} = x^r x^s$.

2.3.7. Show that $a^b = \inf\{a^s : s > b,\ s \in \mathbf{Q}\}$.

2.3.8. With $b$ and $c$ real and $a > 0$, show that $(a^b)^c = a^{bc}$.


2.3.9. Fix $a > 0$. If $f : \mathbf{R} \to \mathbf{R}$ is continuous, $f(1) = a$, and $f(x + x') =
f(x) f(x')$ for $x, x' \in \mathbf{R}$, then $f(x) = a^x$.


2.3.10. Use the $\epsilon$-$\delta$ criterion to show that $f(x) = 1/x$ is continuous at $x = 1$.

2.3.11. A real $x$ is algebraic if $x$ is a root of a polynomial of degree $d \ge 1$,

$$a_0 x^d + a_1 x^{d-1} + \dots + a_{d-1} x + a_d = 0,$$

with rational coefficients $a_0, a_1, \dots, a_d$. A real is transcendental if it is not
algebraic. For example, every rational is algebraic. Show that the set of algebraic
numbers is countable (§1.7). Conclude that the set of transcendental
numbers is uncountable.


2.3.12. Let $a$ be an algebraic number. If $f(a) = 0$ for some polynomial $f$
with rational coefficients, but $g(a) \ne 0$ for any polynomial $g$ with rational
coefficients of lesser degree, then $f$ is a minimal polynomial for $a$, and the
degree of $f$ is the algebraic order of $a$. Now suppose that $a$ is algebraic of
order $d \ge 2$. Show that all the roots of a minimal polynomial $f$ are irrational.


2.3.13. Suppose that the algebraic order of $a$ is $d \ge 2$. Then there is a $c > 0$,
such that

$$\left| a - \frac{m}{n} \right| \ge \frac{c}{n^d}, \qquad n, m \ge 1.$$

(See Exercise 1.4.9. Here you will need the modulus of continuity $\mu_a$ at $a$ of
$g(x) = f(x)/(x - a)$, where $f$ is a minimal polynomial of $a$.)
2.3.14. Use the previous exercise to show that

$$.1100010\ldots = \frac{1}{10} + \frac{1}{10^2} + \frac{1}{10^6} + \dots = \sum_{n=1}^{\infty} \frac{1}{10^{n!}}$$

is transcendental.


2.3.15. For $s > 1$ real, $\sum_{n=1}^{\infty} n^{-s}$ converges.

2.3.16. If $a > 1$, $b > 0$, and $c > 0$, then $b^{\log_a c} = c^{\log_a b}$, and $\sum_{n=1}^{\infty} 5^{-\log_3 n}$
converges.


2.3.17. Give an example of an $f : [0,1] \to [0,1]$ that is invertible but not
monotone.


2.3.18. Let f be of bounded variation (Exercise 2.2.4) on (a,b). Then the 
set of points at which f is not continuous is at most countable. Moreover, 
every discontinuity, at worst, is a jump. 


2.3.19. Let $f : (a,b) \to \mathbf{R}$ be continuous and let $M = \sup\{f(x) : a < x < b\}$.
Assume $f(a+)$ exists with $f(a+) < M$ and $f(b-)$ exists with $f(b-) < M$.
Then $\sup\{f(x) : a < x < b\}$ is attained. Use Theorem 2.1.2.


2.3.20. If $f : \mathbf{R} \to \mathbf{R}$ satisfies

$$\lim_{x \to \pm\infty} \frac{f(x)}{|x|} = +\infty,$$

we say that $f$ is superlinear. If $f$ is superlinear and continuous, then the sup
is attained in

$$g(y) = \sup_{-\infty < x < \infty} [xy - f(x)] = \max_{-\infty < x < \infty} [xy - f(x)],$$

and $g$ is superlinear. Use Exercise 2.3.19.


2.3.21. If $f : \mathbf{R} \to \mathbf{R}$ is superlinear and continuous and $g$ is as above, then
$g$ is also continuous. (Modify the logic of the previous solution.)


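2.3.22. Let $f(x) = 1 + \lfloor x \rfloor - x$, $x \in \mathbf{R}$, where $\lfloor x \rfloor$ denotes the greatest integer
$\le x$ (Figure 2.3). Compute

$$\lim_{n \nearrow \infty} \left( \lim_{m \nearrow \infty} [f(n!\,x)]^m \right)$$

for $x \in \mathbf{Q}$ and for $x \notin \mathbf{Q}$.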


2.3.23. Let $f(x) = 1/x$, $0 < x < 1$. Compute $\mu_c(\delta)$ explicitly for $0 < c < 1$
and $\delta > 0$. With $I = (0,1)$, show that $\mu_I(\delta) = \infty$ for all $\delta > 0$. Conclude
that $f$ is not uniformly continuous on $(0,1)$. (There are two cases, $c \le \delta$ and
$c > \delta$.)


2.3.24. Let $f : \mathbf{R} \to \mathbf{R}$ be continuous, and suppose that $f(\infty)$ and $f(-\infty)$
exist and are finite. Show that $f$ is uniformly continuous on $\mathbf{R}$.


2.3.25. Use $\sqrt{2}^{\sqrt{2}}$ to show that there are irrationals $a, b$, such that $a^b$ is
rational. (Consider the two cases $\sqrt{2}^{\sqrt{2}} \in \mathbf{Q}$ and $\sqrt{2}^{\sqrt{2}} \notin \mathbf{Q}$.)


Chapter 3 
Differentiation 


3.1 Derivatives 


Let $f$ be defined on $(a,b)$, and choose $c \in (a,b)$. We say that $f$ is differentiable
at $c$ if

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$

exists as a real, i.e., exists and is not $\pm\infty$. If it exists, we denote this limit
$f'(c)$ or $\frac{df}{dx}(c)$, and we say that $f'(c)$ is the derivative of $f$ at $c$. If $f$ is
differentiable at $c$ for all $a < c < b$, we say that $f$ is differentiable on $(a,b)$
or, if it is clear from the context, differentiable. In this case, the derivative
$f' : (a,b) \to \mathbf{R}$ is a function defined on all of $(a,b)$.

For example, the function $f(x) = mx + b$ is differentiable on $\mathbf{R}$ with
derivative $f'(c) = m$ for all $c$ since

$$\lim_{x \to c} \frac{(mx + b) - (mc + b)}{x - c} = \lim_{x \to c} m = m.$$

Since its graph is a line, the derivative of $f(x) = mx + b$ (at any real)
is the slope of its graph. In particular, the derivative of a constant function
$f(x) = b$ for all $x$ is zero.

If $f(x) = x^2$, then $f$ is differentiable with derivative

$$f'(c) = \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} \frac{(x - c)(x + c)}{x - c} = \lim_{x \to c} (x + c) = 2c.$$

If $f$ is differentiable at $c$, then

$$\lim_{x \to c} f(x) = \lim_{x \to c} \left[ \left( \frac{f(x) - f(c)}{x - c} \right)(x - c) + f(c) \right]
= \lim_{x \to c} \left( \frac{f(x) - f(c)}{x - c} \right) \lim_{x \to c} (x - c) + f(c) = f'(c) \cdot 0 + f(c) = f(c).$$

So $f$ is continuous at $c$. Hence, a differentiable function is continuous.
However, $f(x) = |x|$ is continuous at $0$ but not differentiable there since

$$\lim_{x \to 0+} \frac{|x| - |0|}{x - 0} = 1,$$

whereas

$$\lim_{x \to 0-} \frac{|x| - |0|}{x - 0} = -1.$$

However,

$$(|x|)' = \frac{x}{|x|}, \qquad x \ne 0,$$

since $|x| = x$ on $(0,\infty)$, hence $(|x|)' = 1$ there, and $|x| = -x$ on $(-\infty, 0)$,
hence $(|x|)' = -1$ there.
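The two computations above can be watched numerically. This is only an illustration with a hypothetical helper `difference_quotient`: for $x^2$ at $c = 3$ the quotients along a sequence $x_n \to c$ approach $2c = 6$, while for $|x|$ at $0$ the one-sided quotients sit at $+1$ and $-1$.

```python
def difference_quotient(f, c, x):
    """The quotient (f(x) - f(c)) / (x - c), for x != c."""
    return (f(x) - f(c)) / (x - c)


def square(x):
    return x * x


c = 3.0
for n in range(1, 7):
    x_n = c + 10.0 ** (-n)  # a sequence x_n -> c with x_n != c
    print(n, difference_quotient(square, c, x_n))  # approaches 2c = 6

# One-sided quotients of |x| at 0: from the right and from the left.
h = 2.0 ** -20
right = difference_quotient(abs, 0.0, h)
left = difference_quotient(abs, 0.0, -h)
print(left, right)
```

Since the left and right quotients of $|x|$ never approach a common value, no two-sided limit exists at $0$.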


Derivatives are computed using their arithmetic properties. 


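Theorem 3.1.1. If $f$ and $g$ are differentiable on $(a,b)$, and $k$ is real, so are
$f + g$, $kf$, $fg$, and, for $a < x < b$,

$$(f + g)'(x) = f'(x) + g'(x),$$
$$(kf)'(x) = k f'(x),$$
$$(fg)'(x) = f'(x) g(x) + f(x) g'(x).$$

Moreover, if $g$ is nonzero on $(a,b)$, then $f/g$ is differentiable and

$$\left( \frac{f}{g} \right)'(x) = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}, \qquad a < x < b.$$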


The first and second identities are linearity of the derivative, the third is
the product rule, and the last is the quotient rule. To derive these rules, let
$a < c < b$. For sums,

$$(f + g)'(c) = \lim_{x \to c} \frac{(f(x) + g(x)) - (f(c) + g(c))}{x - c}
= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} + \lim_{x \to c} \frac{g(x) - g(c)}{x - c}
= f'(c) + g'(c).$$

For scalar multiplication,

$$(kf)'(c) = \lim_{x \to c} \frac{k f(x) - k f(c)}{x - c}
= k \lim_{x \to c} \frac{f(x) - f(c)}{x - c} = k f'(c).$$

For products,

$$(fg)'(c) = \lim_{x \to c} \frac{f(x) g(x) - f(c) g(c)}{x - c}
= \lim_{x \to c} \frac{f(x) g(x) - f(c) g(x) + f(c) g(x) - f(c) g(c)}{x - c}$$
$$= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \lim_{x \to c} g(x) + f(c) \lim_{x \to c} \frac{g(x) - g(c)}{x - c}
= f'(c) g(c) + f(c) g'(c).$$

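For quotients,

$$\left( \frac{f}{g} \right)'(c) = \lim_{x \to c} \frac{f(x)/g(x) - f(c)/g(c)}{x - c}
= \lim_{x \to c} \frac{1}{g(x) g(c)} \left[ \frac{f(x) - f(c)}{x - c}\, g(c) - f(c)\, \frac{g(x) - g(c)}{x - c} \right]
= \frac{f'(c) g(c) - f(c) g'(c)}{g(c)^2}.$$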

Above, we saw that the derivative of $f(x) = x$ is $f'(x) = 1$. By induction,
we show that the derivative of the monomial $f(x) = x^n$ is $n x^{n-1}$. Since this
is true for $n = 1$, assume it is true for $n \ge 1$. Then by the product rule if
$f(x) = x^{n+1}$,

$$f'(x) = (x^{n+1})' = (x^n x)' = (x^n)' x + x^n (x)' = n x^{n-1} x + x^n (1) = (n + 1) x^n.$$

This establishes that $(x^n)' = n x^{n-1}$ for all $n \ge 1$. Explicitly, this means
$(x^n - c^n)/(x - c)$ converges to $n c^{n-1}$ as $x \to c$, for all $c$ real. A more vivid
description of this convergence is given in (3.4.4).

Since polynomials are linear combinations of monomials, they are differentiable
everywhere. For example,

$$(x^3 + 5x + 1)' = (x^3)' + (5x)' + (1)' = 3x^2 + 5.$$

Moreover,

$$(x^n)' = n x^{n-1}, \qquad n \in \mathbf{Z},\ x \ne 0. \qquad (3.1.1)$$

This is clear for $n = 0$, whereas, for $n \ge 1$, using the quotient rule, we find
that

$$(x^{-n})' = \left( \frac{1}{x^n} \right)' = \frac{(1)' x^n - 1 \cdot (x^n)'}{(x^n)^2}
= \frac{0 \cdot x^n - n x^{n-1}}{x^{2n}} = -\frac{n}{x^{n+1}} = (-n) x^{-n-1}.$$

This establishes (3.1.1). Another consequence of the quotient rule is that a
rational function is differentiable wherever it is defined. For example,

$$\left( \frac{x^2 - 1}{x^2 + 1} \right)' = \frac{(2x)(x^2 + 1) - (x^2 - 1)(2x)}{(x^2 + 1)^2} = \frac{4x}{(x^2 + 1)^2}.$$


We say that a function g is tangent to f at c if the difference f(x) - g(x)
vanishes faster than first order in x - c, i.e., if

$$\lim_{x \to c} \frac{f(x) - g(x)}{x - c} = 0.$$

Suppose that g(x) = mx + b is tangent to f at c. Since the graph of g is a line,
it is reasonable to call it the line tangent to f at (c, f(c)) or, more simply, the
tangent line at c. Note that two lines are tangent to each other iff they coincide.
Thus, a function f can have, at most, one tangent line at a given real c.




Fig. 3.1 The derivative is the slope of the tangent line 


If f is differentiable at c, then $g(x) = f'(c)(x - c) + f(c)$ is tangent to f
at c, since

$$\lim_{x \to c} \frac{f(x) - g(x)}{x - c}
= \lim_{x \to c} \frac{f(x) - f(c) - f'(c)(x - c)}{x - c}
= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} - f'(c) = 0.$$


Hence, the derivative f'(c) of f at c is the slope of the tangent line at c 
(Figure 3.1). 
If f is differentiable at c, there is a positive k and some interval (c - d, c + d)
about c on which

$$|f(x) - f(c)| \le k|x - c|, \qquad c - d < x < c + d. \tag{3.1.2}$$

Indeed, if this were not so, then for each $n \ge 1$ we would find a real
$x_n \in (c - 1/n, c + 1/n)$ contradicting this claim, i.e., satisfying

$$\left| \frac{f(x_n) - f(c)}{x_n - c} \right| > n.$$

But then $x_n \to c$, and hence, this inequality would contradict differentiability
at c.
The following describes the behavior of derivatives under composition.
The following describes the behavior of derivatives under composition. 


Theorem 3.1.2 (Chain Rule). Let f, g be differentiable on (a,b), (c,d),
respectively. If $f((a,b)) \subset (c,d)$, then $g \circ f$ is differentiable on (a,b) with

$$(g \circ f)'(x) = g'(f(x))f'(x), \qquad a < x < b.$$


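To see this, let a < c < b, and assume, first, $f'(c) \ne 0$. Then $x_n \to c$ and
$x_n \ne c$ for all $n \ge 1$ imply $f(x_n) \to f(c)$ and $(f(x_n) - f(c))/(x_n - c) \to f'(c) \ne 0$.
Hence, there is an $N \ge 1$, such that $f(x_n) - f(c) \ne 0$ for $n \ge N$.
Thus,

$$\lim_{n \nearrow \infty} \frac{g(f(x_n)) - g(f(c))}{x_n - c}
= \lim_{n \nearrow \infty} \frac{g(f(x_n)) - g(f(c))}{f(x_n) - f(c)} \cdot \frac{f(x_n) - f(c)}{x_n - c}$$
$$= \lim_{n \nearrow \infty} \frac{g(f(x_n)) - g(f(c))}{f(x_n) - f(c)} \cdot \lim_{n \nearrow \infty} \frac{f(x_n) - f(c)}{x_n - c}
= g'(f(c))f'(c).$$

Since $x_n \to c$ and $x_n \ne c$ for all $n \ge 1$, by definition of $\lim_{x \to c}$ (§2.2),

$$(g \circ f)'(c) = \lim_{x \to c} \frac{g(f(x)) - g(f(c))}{x - c} = g'(f(c))f'(c).$$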


This establishes the result when $f'(c) \ne 0$. If $f'(c) = 0$, by (3.1.2) there is a
k with

$$|g(y) - g(f(c))| \le k|y - f(c)|$$

for y near f(c). Since $x \to c$ implies $f(x) \to f(c)$, in this case, we obtain

$$|(g \circ f)'(c)| = \lim_{x \to c} \left| \frac{g(f(x)) - g(f(c))}{x - c} \right|
\le \lim_{x \to c} \frac{k|f(x) - f(c)|}{|x - c|} = k|f'(c)| = 0.$$

Hence, $(g \circ f)'(c) = 0 = g'(f(c))f'(c)$.


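For example,

$$\left( \left(1 - \frac{x}{n}\right)^n \right)' = n\left(1 - \frac{x}{n}\right)^{n-1} \cdot \left(-\frac{1}{n}\right) = -\left(1 - \frac{x}{n}\right)^{n-1}$$

follows by choosing $g(x) = x^n$ and $f(x) = 1 - x/n$, 0 < x < n.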
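
The example above can be checked numerically. The sketch below, with an arbitrarily chosen n and point c (both assumptions made for illustration), evaluates the chain rule product g'(f(c)) f'(c), the closed form -(1 - c/n)^(n-1), and a difference quotient of the composition.

```python
def difference_quotient(func, c, h=1e-6):
    """(func(c + h) - func(c)) / h, an approximation to func'(c)."""
    return (func(c + h) - func(c)) / h

n = 5                                   # illustrative exponent
f = lambda x: 1 - x / n                 # inner function, f'(x) = -1/n
g = lambda u: u ** n                    # outer function, g'(u) = n u^(n-1)
composite = lambda x: g(f(x))

c = 2.0                                 # any point with 0 < c < n
chain_rule_value = n * f(c) ** (n - 1) * (-1 / n)   # g'(f(c)) * f'(c)
closed_form = -(1 - c / n) ** (n - 1)
numeric = difference_quotient(composite, c)

print(chain_rule_value, closed_form, numeric)
```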
If we set u = f(x) and y = g(u) = g(f(x)), then the chain rule takes the
easily remembered form

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$


We say that $f : (a,b) \to \mathbf{R}$ has a local maximum at $c \in (a,b)$ if, for some
$\delta > 0$, $f(x) \le f(c)$ on $(c - \delta, c + \delta)$. Similarly, we say that f has a local
minimum at $c \in (a,b)$ if, for some $\delta > 0$, $f(x) \ge f(c)$ on $(c - \delta, c + \delta)$.
Alternatively, we say that c is a local max or a local min of f. If, instead,
these inequalities hold for all x in (a,b), then we say that c is a (global)
maximum or a (global) minimum of f on (a,b). It is possible for a function
to have a local maximum at every rational (see Exercise 3.1.8).

A critical point of a differentiable f is a real c with f’(c) = 0. A critical 
value of f is a real d, such that d = f(c) for some critical point c. 

Let f be defined on (a,b). Suppose that f has a local minimum at c and
is differentiable there. Then for x > c near c, $f(x) \ge f(c)$, so

$$f'(c) = \lim_{x \to c+} \frac{f(x) - f(c)}{x - c} \ge 0.$$

For x < c near c, $f(x) \ge f(c)$, so

$$f'(c) = \lim_{x \to c-} \frac{f(x) - f(c)}{x - c} \le 0.$$

Hence, f'(c) = 0. Applying this result to g = -f, we see that if f has a local
maximum at c, then f'(c) = 0. We conclude that a local maximum or a local
minimum is a critical point. The converse is not, generally, true since c = 0
is a critical point of $f(x) = x^3$ but is neither a local maximum nor a local
minimum.

Using critical points, one can maximize and minimize functions over their
domains: To compute

$$M = \sup_{a < x < b} f(x),$$

either the sup is attained at some a < x < b or M = f(a+) or M = f(b-),
assuming these exist (Exercise 2.3.19). When f is differentiable, it is enough
therefore to compute the critical values of f and compare them with f(a+)
and f(b-). If the largest of these values is f(c) for some critical point
$c \in (a,b)$, then f is maximized at c. If the largest of these values is f(b-) or
f(a+), then f has a sup but no maximum over (a,b). For example,

$$\max_{-\infty < x < \infty} (6x - x^2) = 9$$

since the only critical point of $f(x) = 6x - x^2$ is at x = 3 and
$f(\infty) = f(-\infty) = -\infty$. Proceed similarly for computing minima.
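
The recipe above (find the critical points, then compare critical values with the endpoint limits) can be sketched in Python for the example 6x - x^2. The bisection helper and the search bracket [-10, 10] are illustrative assumptions; any bracket on which f' changes sign would do.

```python
def bisect_root(func, lo, hi, tol=1e-12):
    """Find a root of func in [lo, hi], assuming func(lo) and func(hi) differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # sign unchanged: root lies in the right half
        else:
            hi = mid                # sign changed: root lies in the left half
    return (lo + hi) / 2

f = lambda x: 6*x - x**2
f_prime = lambda x: 6 - 2*x

critical_point = bisect_root(f_prime, -10.0, 10.0)
maximum = f(critical_point)
print(critical_point, maximum)
```

Since f tends to minus infinity at both ends of the real line, the single critical value is the maximum.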


Theorem 3.1.3 (Mean Value Theorem). If f is continuous on [a,b] and
differentiable on (a,b), then there is a c in (a,b) with

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

To see this (Figure 3.2), first we subtract a line from f by setting

$$g(x) = f(x) - \left[ \frac{f(b) - f(a)}{b - a}(x - a) + f(a) \right], \qquad a \le x \le b.$$

Then g is continuous on [a,b], differentiable on (a,b), and g(a) = g(b) = 0. If
g(x) = 0 everywhere on [a,b], let a < c < b be any real. If g(x) > 0 somewhere
in (a,b), let c be a real at which g is maximized. If g(x) < 0 somewhere in
(a,b), let c be a real at which g is minimized. In all three cases, we obtain
g'(c) = 0. Since

$$g'(c) = f'(c) - \frac{f(b) - f(a)}{b - a},$$

we are done.


Fig. 3.2 The mean value theorem 


For example, choose $f(x) = (1 - x/n)^n$, a = 0, b > 0. Then
$f'(x) = -(1 - x/n)^{n-1}$ is between -1 and 0 when 0 < x < n. By the mean value
theorem, we conclude that

$$0 \le \frac{1 - (1 - b/n)^n}{b} \le 1, \qquad 0 < b < n,\ n \ge 1,$$

since the ratio equals the negative of (f(b) - f(0))/(b - 0). The point of
this inequality is that, when b > 0 is small, the numerator is small enough to
compensate for the smallness of the denominator, yielding a quotient bounded
between 0 and 1.

As a consequence of the mean value theorem, if f and g are differentiable
on (a,b) and f'(x) = g'(x) for all x, then f and g differ by a constant:
f(x) = g(x) + C. To see this, note that h(x) = f(x) - g(x) satisfies h'(x) = 0, so by
the mean value theorem (h(c) - h(d))/(c - d) equals h' at some intermediate
real. Hence, h(c) = h(d); hence, h is a constant function.
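
The bound obtained above from the mean value theorem for f(x) = (1 - x/n)^n can be spot-checked numerically. In this sketch the sample values of n and b are arbitrary illustrative choices with 0 < b < n, and a small tolerance absorbs floating-point rounding.

```python
def mvt_ratio(b, n):
    """(1 - (1 - b/n)**n) / b, which the mean value theorem bounds between 0 and 1."""
    return (1 - (1 - b / n) ** n) / b

# Sample points with 0 < b < n; the largest b used is 0.9 and the smallest n is 1.
ratios = [mvt_ratio(b, n) for n in (1, 2, 10, 100) for b in (1e-3, 0.1, 0.5, 0.9)]
print(min(ratios), max(ratios))
```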

Let (-b,b) be an interval symmetric about 0. Given a function
$f : (-b,b) \to \mathbf{R}$, its even part $f^e$ is the function

$$f^e(x) = \frac{f(x) + f(-x)}{2},$$

and its odd part $f^o$ is

$$f^o(x) = \frac{f(x) - f(-x)}{2}.$$

Clearly, $f = f^e + f^o$.

A function f is even over (-b,b) if $f = f^e$ on (-b,b) and odd over (-b,b)
if $f = f^o$ on (-b,b). Thus, an even function satisfies f(-x) = f(x) on (-b,b),
whereas an odd function satisfies f(-x) = -f(x) on (-b,b). For example,
$x^n$ is even or odd on R according to whether n is even or odd.


Exercises 


3.1.1. Let a > 0 and define $f(x) = |x|^a$. Show that f is differentiable at 0 iff
a > 1.


3.1.2. Define $f : \mathbf{R} \to \mathbf{R}$ by setting f(x) = 0, when x is irrational, and
setting $f(m/n) = 1/n^3$ when n > 0 and m have no common factor. Use
Exercise 1.4.9 to show that f is differentiable at $\sqrt{2}$. What is $f'(\sqrt{2})$?


3.1.3. Let $f(x) = ax^2/2$ with a > 0, and set

$$g(y) = \sup_{-\infty < x < \infty} (xy - f(x)), \qquad y \in \mathbf{R}. \tag{3.1.3}$$

By direct computation, show that $g(y) = y^2/2a$ and f' and g' are inverses.


3.1.4. If $g : \mathbf{R} \to \mathbf{R}$ is superlinear (Exercise 2.3.20) and differentiable, then
g'(R) is unbounded above and below: $\sup g'(\mathbf{R}) = \infty$ and $\inf g'(\mathbf{R}) = -\infty$.
(Argue by contradiction, and use the mean value theorem.)


3.1.5. Suppose that f is continuous on (a,b), differentiable on (a,b) punctured
at c, a < c < b, and $\lim_{x \to c} f'(x) = L$ exists. Show that f'(c) exists
and equals L.


3.1.6. Suppose that $f : (a,b) \to \mathbf{R}$ is differentiable, a < c < b, and f'(c+)
and f'(c-) exist. Show that f'(c+) = f'(c) = f'(c-). (As opposed to the
previous exercise, here, we assume that f'(c) exists.)


3.1.7. Suppose that f is differentiable on a bounded interval (a,b) with
$|f'| \le I$. Show that f is of bounded variation (Exercise 2.2.4) over (a,b)
with total variation $\le I(b - a)$.


3.1.8. Show that the function $f : \mathbf{R} \to \mathbf{R}$ in Exercise 2.2.1 has a local
maximum at every $c \in \mathbf{Q}$.


3.1.9. Suppose that $f : (-b,b) \to \mathbf{R}$ is differentiable. Then f' is even or odd
if f is odd or even, respectively.


3.1.10. Suppose that $f : \mathbf{R} \to \mathbf{R}$ is continuous on R and f is differentiable
at $r \in \mathbf{R}$. We say r is a root of f if f(r) = 0. Show that r is a root of f iff
f(x) = (x - r)g(x) for some continuous function $g : \mathbf{R} \to \mathbf{R}$.


3.1.11. Suppose $f : \mathbf{R} \to \mathbf{R}$ is continuous, and suppose f is differentiable at
d distinct reals $r_1, \dots, r_d$. Show that $r_1, r_2, \dots, r_d$ are roots of f iff

$$f(x) = (x - r_1)(x - r_2) \cdots (x - r_d)g(x)$$

for some continuous function $g : \mathbf{R} \to \mathbf{R}$.


3.1.12. Let $f : \mathbf{R} \to \mathbf{R}$ be differentiable. Show that if f has d distinct roots
$r_1, \dots, r_d$, then f' has d - 1 distinct roots $s_1, \dots, s_{d-1}$, where the $s_j$'s are
distinct from the $r_j$'s.
To differentiate roots, we need to know how derivatives of inverses behave.
But continuous functions are invertible iff they are strictly monotone (§2.3), 
so we begin by using the derivative to identify monotonicity. 


Theorem 3.2.1. Let f be differentiable on (a,b). If $f'(x) \ne 0$ for a < x < b,
then f is strictly monotone on (a,b) and f'(x) > 0 on (a,b) or f'(x) < 0
on (a,b). Moreover, $f'(x) \ge 0$ on (a,b) iff f is increasing, and $f'(x) \le 0$ on
(a,b) iff f is decreasing.


By the mean value theorem, given a < x < y < b, there is a c in (x, y)
satisfying

$$f(y) - f(x) = f'(c)(y - x).$$

If f' is never zero, this shows that f is injective, hence, strictly monotone
by the inverse function theorem (§2.3). This also shows that $f' \ge 0$ on (a,b)
implies f is increasing and $f' \le 0$ on (a,b) implies f is decreasing. Conversely,
increasing f implies $f(x) \ge f(c)$ for x > c, so

$$f'(c) = \lim_{x \to c+} \frac{f(x) - f(c)}{x - c} \ge 0$$

for all a < c < b. The case of decreasing f is similar. In particular, we conclude that
if f' is never zero and f is monotone, we must have f' > 0 on (a,b) or f' < 0
on (a,b).
It is not, generally, true that strict monotonicity implies the nonvanishing
of f'. For example, $f(x) = x^3$ is strictly increasing on R, but f'(0) = 0.
Since its derivative was computed in the previous section, the function

$$f(x) = \frac{x^2 - 1}{x^2 + 1}$$

is strictly increasing on $(0, \infty)$ and strictly decreasing on $(-\infty, 0)$. Thus, the
critical point x = 0 is a minimum of f on R.

A useful consequence of this theorem is the following: If f and g are
differentiable on (a,b), continuous on [a,b], f(a) = g(a), and $f'(x) \ge g'(x)$ on
(a,b), then $f(x) \ge g(x)$ on [a,b]. This follows by applying the theorem to
h = f - g.

Another consequence is that derivative functions, although themselves 
not necessarily continuous, satisfy the intermediate value property (Exer- 
cise 3.2.8). 

Now we can state the inverse function theorem for differentiable functions. 


Theorem 3.2.2 (Inverse Function Theorem). Let f be continuous on
[a,b] and differentiable on (a,b), and suppose that $f'(x) \ne 0$ on (a,b). Let
[m, M] = f([a,b]). Then $f : [a,b] \to [m, M]$ is invertible, and its inverse g
is continuous on [m, M], differentiable on (m, M), and $g'(y) \ne 0$ on (m, M).
Moreover,

$$g'(y) = \frac{1}{f'(g(y))}, \qquad m < y < M.$$

Note, first, that f' > 0 on (a,b) or f' < 0 on (a,b) by the previous theorem.
Suppose that f' > 0 on (a,b), the case f' < 0 being entirely similar. Then
f is strictly increasing; hence, the range [m, M] must equal [f(a), f(b)], f is
invertible, and its inverse g is strictly increasing and continuous. If a < c < b
and $y_n \to f(c)$, $y_n \ne f(c)$ for all $n \ge 1$, then $x_n = g(y_n) \to g(f(c)) = c$ and
$x_n \ne c$ for all $n \ge 1$, so $y_n = f(x_n)$, $n \ge 1$, and

$$\lim_{n \nearrow \infty} \frac{g(y_n) - g(f(c))}{y_n - f(c)} = \lim_{n \nearrow \infty} \frac{x_n - c}{f(x_n) - f(c)} = \frac{1}{f'(c)}.$$

Since $(y_n)$ is any sequence converging to f(c), this implies

$$g'(f(c)) = \lim_{y \to f(c)} \frac{g(y) - g(f(c))}{y - f(c)} = \frac{1}{f'(c)}.$$

Since y = f(c) iff c = g(y), the result follows.
This result is false if the hypothesis is weakened to the nonvanishing of
the derivative at a single point: It is possible for a differentiable function
to satisfy $f'(0) \ne 0$ but not be injective on any interval containing zero
(Exercise 3.6.7).

As an application, let b > 0. Since for n > 0, the function $f(x) = x^n$ is
continuous on [0,b] and $f'(x) = nx^{n-1} \ne 0$ on (0,b), its inverse $g(y) = y^{1/n}$
is continuous on $[0, b^n]$ and differentiable on $(0, b^n)$ with

$$g'(y) = \frac{1}{f'(g(y))} = \frac{1}{n\,g(y)^{n-1}} = \frac{1}{n\,y^{(n-1)/n}} = \frac{1}{n}\, y^{(1/n)-1}.$$

Since b > 0 is arbitrary, this is valid on $(0, \infty)$. Similarly, this holds on $(0, \infty)$
for n < 0.

By applying the chain rule, for all rationals r = m/n, the power functions
$f(x) = x^r = x^{m/n} = (x^m)^{1/n}$ are differentiable on $(0, \infty)$ with derivative
$f'(x) = rx^{r-1}$, since

$$f'(x) = \left( (x^m)^{1/n} \right)' = \frac{1}{n}(x^m)^{(1/n)-1} \cdot mx^{m-1}
= \frac{m}{n}\, x^{(m/n)-m}\, x^{m-1} = \frac{m}{n}\, x^{(m/n)-1} = rx^{r-1}.$$

Thus, the derivative of $f(x) = x^r$ is $f'(x) = rx^{r-1}$ for x > 0, for all $r \in \mathbf{Q}$.
Using the chain rule, we now know how to differentiate any algebraic
function. For example, the derivative of

$$f(x) = \sqrt{\frac{1 - x^2}{1 + x^2}}, \qquad 0 < x < 1,$$

is

$$f'(x) = \frac{1}{2} \left( \frac{1 - x^2}{1 + x^2} \right)^{-1/2} \cdot \frac{-4x}{(1 + x^2)^2}
= \frac{-2x}{(1 + x^2)\sqrt{1 - x^4}}.$$

To compute the derivative of $f(x) = x^a$ when a > 0 is not in Q, let r < s
be rationals with r < a < s, and consider the limit

$$\lim_{x \to 1+} \frac{x^a - 1}{x - 1}. \tag{3.2.1}$$

Since for any $x_n \to 1+$, the sequence $B_n = (x_n^a - 1)/(x_n - 1)$ lies between
the sequences $A_n = (x_n^r - 1)/(x_n - 1)$ and $C_n = (x_n^s - 1)/(x_n - 1)$, the
upper and lower limits of $(B_n)$ lie between $\lim_{n \nearrow \infty} A_n = r1^{r-1} = r$ and
$\lim_{n \nearrow \infty} C_n = s1^{s-1} = s$. Since r < a < s are arbitrary, the upper and lower
limits both equal a; hence,

$$B_n = \frac{x_n^a - 1}{x_n - 1} \to a;$$

thus, the limit (3.2.1) equals a. Since f(x) = 1/x is continuous at x = 1,
$x_n \to 1-$ implies $y_n = 1/x_n \to 1+$, so

$$\lim_{n \nearrow \infty} \frac{x_n^a - 1}{x_n - 1}
= \lim_{n \nearrow \infty} \frac{(1/y_n)^a - 1}{(1/y_n) - 1}
= \lim_{n \nearrow \infty} y_n^{1-a}\, \frac{y_n^a - 1}{y_n - 1} = 1 \cdot a = a.$$

Thus,

$$\lim_{x \to 1} \frac{x^a - 1}{x - 1} = a.$$

Hence, $f(x) = x^a$ is differentiable at x = 1 with f'(1) = a. Since

$$\lim_{x \to c} \frac{x^a - c^a}{x - c} = c^{a-1} \lim_{x/c \to 1} \frac{(x/c)^a - 1}{(x/c) - 1} = ac^{a-1},$$

f is differentiable on $(0, \infty)$ with $f'(c) = ac^{a-1}$. Thus, for all real a > 0, the
derivative of $f(x) = x^a$ at x > 0 is $f'(x) = ax^{a-1}$. Using the quotient rule,
the same result holds for real a < 0.

As an application, let v be any real greater than 1. Then by the chain
rule, the derivative of $f(x) = (1 + x)^v - 1 - vx$ is $f'(x) = v(1 + x)^{v-1} - v$;
hence, the only critical point is x = 0. Since f(-1) = -1 + v > 0 = f(0) and
$f(\infty) = \infty$, the minimum of f over $(-1, \infty)$ is f(0) = 0. Hence,

$$(1 + b)^v \ge 1 + vb, \qquad b \ge -1. \tag{3.2.2}$$

We already knew this for v a natural (Exercise 1.3.22), but now we know
this for any real v > 1.
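
A quick numerical spot check of (3.2.2) for non-integer exponents is sketched below. The sample values of b and v are arbitrary illustrative choices, and a tiny tolerance is allowed for floating-point rounding.

```python
def bernoulli_gap(b, v):
    """Return (1 + b)**v - (1 + v*b); (3.2.2) says this is >= 0 for b >= -1, v > 1."""
    return (1 + b) ** v - (1 + v * b)

samples_b = [-1.0, -0.5, -0.1, 0.0, 0.1, 1.0, 10.0]
samples_v = [1.5, 2.0, 3.7]
gaps = [bernoulli_gap(b, v) for b in samples_b for v in samples_v]
print(min(gaps))
```

The gap vanishes at b = 0, in line with f(0) = 0 being the minimum in the argument above.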
Now we compute the derivative of the exponential function $f(x) = a^x$ with
base a > 1. We begin with finding f'(0).
If 0 < x < y and a > 1, then insert v = y/x > 1 and $b = a^x - 1 > 0$ in
(3.2.2), and rearrange to get

$$\frac{a^x - 1}{x} \le \frac{a^y - 1}{y}, \qquad 0 < x \le y.$$

Thus,

$$m_+ = \lim_{x \to 0+} \frac{a^x - 1}{x}$$

exists since it equals

$$\inf\{(a^x - 1)/x : x > 0\}$$

(Exercise 2.2.2). Moreover, $m_+ \ge 0$ since $a^x > 1$ for x > 0. Also

$$m_- = \lim_{x \to 0-} \frac{a^x - 1}{x} = \lim_{x \to 0+} \frac{a^{-x} - 1}{-x}
= \lim_{x \to 0+} a^{-x} \cdot \frac{a^x - 1}{x} = 1 \cdot m_+ = m_+.$$

Hence, the exponential with base a > 1 is differentiable at x = 0, and we
denote its derivative there by m(a). Since $a^x = a^c a^{x-c}$,

$$\lim_{x \to c} \frac{a^x - a^c}{x - c} = a^c \lim_{x \to c} \frac{a^{x-c} - 1}{x - c} = a^c m(a).$$

Hence, $f(x) = a^x$ is differentiable on R, and $f'(x) = a^x m(a)$ with $m(a) \ge 0$.


If m(a) = 0, then f'(x) = 0 for all x; hence, $a^x$ is constant, a contradiction.
Hence, m(a) > 0. Also for b > 1 and a > 1,

$$m(b) = \lim_{x \to 0} \frac{b^x - 1}{x} = \lim_{x \to 0} \frac{a^{x \log_a b} - 1}{x} = m(a) \log_a b$$

by the chain rule. By fixing a and varying b, we see that m is a continuous,
strictly increasing function with $m(\infty) = \infty$ and m(1+) = 0. By the
intermediate value property (§2.2), we conclude that $m((1, \infty)) = (0, \infty)$.

Thus, and this is very important, there is a unique real e > 1 with 
m(e) = 1. The exponential and logarithm functions with base e are called 
natural. Throughout the book, e denotes this particular number. We summa- 
rize the results. 


Theorem 3.2.3. For all a > 0, the exponential $f(x) = a^x$ is differentiable
on R. There is a unique real e > 1, such that $f(x) = e^x$ implies $f'(x) = e^x$.
More generally, $f(x) = a^x$ implies $f'(x) = a^x \log_e a$.


For a > 1, this was derived above. To derive the theorem for 0 < a < 1,
use $a^x = (1/a)^{-x}$ and the chain rule.
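
To make the definition of e concrete, the sketch below approximates m(a), the derivative of the exponential with base a at 0, by a difference quotient with a small step, and then bisects for the base at which m equals 1, relying on m being increasing. The step h, the bracket [2, 3], and the tolerance are assumptions chosen for illustration; the result is only as accurate as the difference quotient.

```python
import math

def m(a, h=1e-6):
    """Approximate m(a) = lim_{x->0} (a**x - 1)/x by a small-h difference quotient."""
    return (a ** h - 1) / h

def find_e(lo=2.0, hi=3.0, tol=1e-9):
    """Bisect for the base with m(base) = 1, using that m is increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if m(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

e_approx = find_e()
error = abs(e_approx - math.e)       # compare with the library constant
scaling_gap = abs(m(4.0) - 2 * m(2.0))  # m(b) = m(a) log_a b with a = 2, b = 4
print(e_approx, error, scaling_gap)
```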

In the sequel, log x will denote $\log_e x$, i.e., we drop the e when writing the
natural logarithm. Then

$$e^{\log x} = x, \qquad \log e^x = x.$$

We end with the derivative of $f(x) = \log_a x$. Since this is the inverse of
the exponential,

$$f'(x) = \frac{1}{a^{f(x)} \log a} = \frac{1}{x \log a}.$$

Thus, $f(x) = \log_a x$ implies $f'(x) = 1/(x \log a)$, x > 0. In particular log e = 1,
so f(x) = log x implies f'(x) = 1/x, x > 0.
For example, combining the above with the chain rule,

$$(\log |x|)' = \frac{1}{x}, \qquad x \ne 0.$$

Another example is ($x \ne \pm 1$)

$$\left( \log \left| \frac{x - 1}{x + 1} \right| \right)' = \left( \frac{x - 1}{x + 1} \right)' \Big/ \left( \frac{x - 1}{x + 1} \right)
= \frac{2}{(x + 1)^2} \cdot \frac{x + 1}{x - 1} = \frac{2}{x^2 - 1}.$$


We will need the following in §3.5. 
Theorem 3.2.4 (Generalized Mean Value Theorem). If f and g are
continuous on [a,b], differentiable on (a,b), and $g'(x) \ne 0$ on (a,b), there
exists a c in (a,b), such that¹

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}.$$

Either g' > 0 on (a,b) or g' < 0 on (a,b). Assume g' > 0 on (a,b). To see
the theorem, let h denote the inverse function of g, so h(g(x)) = x, and set
F(x) = f(h(x)). Then F(g(x)) = f(x); F is continuous on [g(a), g(b)] and
differentiable on (g(a), g(b)). So applying the mean value theorem, the chain
rule, and the inverse function theorem, there is a d in (g(a), g(b)), such that

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{F(g(b)) - F(g(a))}{g(b) - g(a)}
= F'(d) = f'(h(d))h'(d) = \frac{f'(h(d))}{g'(h(d))}.$$

Now let c = h(d). Then c is in (a,b). The case g' < 0 on (a,b) is similar.
We end the section with l'Hôpital's rule.


Theorem 3.2.5 (L'Hôpital's Rule). Let f and g be differentiable on an
open interval (a,b) punctured at c, a < c < b. Suppose that $\lim_{x \to c} f(x) = 0$
and $\lim_{x \to c} g(x) = 0$. Then $g'(x) \ne 0$ for $x \ne c$ and

$$\lim_{x \to c} \frac{f'(x)}{g'(x)} = L$$

imply²

$$\lim_{x \to c} \frac{f(x)}{g(x)} = L. \tag{3.2.3}$$

To obtain this, define f and g at c by setting f(c) = g(c) = 0. Then f and
g are continuous on (a,b). Now let $x_n \to c+$. Apply the generalized mean
value theorem on $(c, x_n)$ for each $n \ge 1$. Then

$$\frac{f(x_n)}{g(x_n)} = \frac{f(x_n) - f(c)}{g(x_n) - g(c)} = \frac{f'(d_n)}{g'(d_n)} \to L$$

since $c < d_n < x_n$. Similarly, this also holds when $x_n \to c-$, and thus, this
holds for $x_n \to c$, which establishes (3.2.3).
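
The rule can be illustrated numerically. In the sketch below the pair f(x) = log(1 + x), g(x) = e^x - 1 at c = 0 is an example chosen here, not one from the text; both functions vanish at 0, g' is nonzero, and f'/g' tends to L = 1, so the ratio f/g should approach 1 as well.

```python
import math

f = lambda x: math.log(1 + x)
g = lambda x: math.exp(x) - 1
f_prime = lambda x: 1 / (1 + x)
g_prime = lambda x: math.exp(x)

# Points x_n decreasing to c = 0 from the right.
xs = [10.0 ** (-k) for k in range(1, 6)]
ratios = [f(x) / g(x) for x in xs]
derivative_ratios = [f_prime(x) / g_prime(x) for x in xs]

for x, r, dr in zip(xs, ratios, derivative_ratios):
    print(x, r, dr)
```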
The above deals with the "indeterminate form" $f(x)/g(x) \to 0/0$. The
case $f(x)/g(x) \to \infty/\infty$ can be handled by turning the fraction f(x)/g(x)
upside down and applying the above. We do not state this case as we do not
use it.


¹ g(b) - g(a) is not zero because it equals g'(d)(b - a) for some a < d < b.
² $g(x) \ne 0$ for $x \ne c$ since g(x) = g(x) - g(c) = g'(d)(x - c).
Exercises


3.2.1. Show that $1 + x \le e^x$ for $x \ge 0$.


3.2.2. Use the generalized mean value theorem to show $(1 + x)^a \le 1 + x^a$
for x > 0 and 0 < a < 1. By induction, conclude

$$(a_1 + \dots + a_n)^a \le a_1^a + \dots + a_n^a$$

for $a_j > 0$, j = 1,...,n.

3.2.3. Show that $\lim_{x \to 0} \log(1 + x)/x = 1$ and

$$\lim_{n \nearrow \infty} \left(1 + \frac{a}{n}\right)^n = e^a, \qquad a \in \mathbf{R}$$

(take the log of both sides). If $a_n \to a$, show also that
$\lim_{n \nearrow \infty} (1 + a_n/n)^n = e^a$.


3.2.4. Let $x \in \mathbf{R}$. Show that the sequence $(1 + x/n)^n$, $n \ge |x|$, increases to
$e^x$ as $n \nearrow \infty$ (use (3.2.2)).


3.2.5. Use the mean value theorem to show that 


1 
La Sg’, xz>0. 
V1l+22 — 
3.2.6. Use the mean value theorem to show that 
1 1 x 


a ye i>1 
Gi GPS Gap Feber e 


3.2.7. Let $f : \mathbf{R} \to \mathbf{R}$ be differentiable with f(0) = 0 and $|f'(x)| \le |f(x)|$ for
all x. Show that f is identically zero.


3.2.8. If $f : (a,b) \to \mathbf{R}$ is differentiable, then $f' : (a,b) \to \mathbf{R}$ satisfies the
intermediate value property: If a < c < d < b and f'(c) < L < f'(d),
then L = f'(x) for some c < x < d. (Start with L = 0; then consider
g(x) = f(x) - Lx. Here the point is that f' need not be continuous.)


3.2.9. If $f : \mathbf{R} \to \mathbf{R}$ is superlinear and differentiable, then f'(R) = R, i.e.,
f' is surjective (Exercise 3.1.4).


3.2.10. For d > 2, let 
Show that


Conclude that 


3.3 Graphing Techniques 


Let f be differentiable on (a,b). If f' = df/dx is differentiable on (a,b), we
denote its derivative by f'' = (f')'; f'' is the second derivative of f. If f''
is differentiable on (a,b), f''' = (f'')' is the third derivative of f. In general,
we let $f^{(n)}$ denote the nth derivative or the derivative of order n, where, by
convention, we take $f^{(0)} = f$. If f has all derivatives, f', f'', f''', ..., we say
f is smooth on (a,b).

An alternate and useful notation for higher derivatives is obtained by
thinking of f' = df/dx as the result of applying d/dx to f, i.e., df/dx =
(d/dx)f. From this point of view, d/dx signifies the operation of differentiation.
Thus, applying d/dx twice, we obtain

$$f'' = \left(\frac{d}{dx}\right)\left(\frac{d}{dx}\right) f = \left(\frac{d^2}{dx^2}\right) f = \frac{d^2 f}{dx^2}.$$

Similarly, third derivatives may be denoted

$$f''' = \left(\frac{d}{dx}\right)^3 f = \frac{d^3 f}{dx^3}.$$


For example, $f(x) = x^2$ has f'(x) = 2x, f''(x) = 2, and $f^{(n)}(x) = 0$ for
$n \ge 3$. More generally, by induction, $f(x) = x^n$, $n \ge 0$, has derivatives

$$\frac{d^k x^n}{dx^k} = \begin{cases} \dfrac{n!}{(n-k)!}\, x^{n-k}, & 0 \le k \le n, \\ 0, & k > n, \end{cases} \tag{3.3.1}$$

so $f(x) = x^n$ is smooth. By the arithmetic properties of derivatives, it follows
that rational functions are smooth wherever they are defined.

Not all functions are smooth. The function f(x) = |x| is not differentiable
at zero. Using this, one can show that $f(x) = x^n|x|$ is n times differentiable on
R, but $f^{(n)}$ is not differentiable at zero. More generally, for f, g differentiable,
we do not expect max(f, g) to be differentiable. However, since $f(x) = x^{1/n}$
is smooth on $(0, \infty)$, algebraic functions are smooth on any open interval of
definition. Also the functions $x^a$, $a^x$, and $\log_a x$ are smooth on $(0, \infty)$.

We know the sign of f' determines the monotonicity of f, in the sense that
$f' \ge 0$ iff f is increasing and $f' \le 0$ iff f is decreasing. How is the sign of
f'' reflected in the graph of f? Since f'' = (f')', we see that $f'' \ge 0$ iff f' is
increasing and $f'' \le 0$ iff f' is decreasing.

More precisely, we say f is convex on (a,b) if, for all a < x < y < b,

$$f((1 - t)x + ty) \le (1 - t)f(x) + tf(y), \qquad 0 \le t \le 1.$$

Take any two points on the graph of f and join them by a chord or line
segment. Then f is convex if the chord lies on or above the graph (Figure 3.3).
We say f is concave on (a,b) if, for all a < x < y < b,

$$f((1 - t)x + ty) \ge (1 - t)f(x) + tf(y), \qquad 0 \le t \le 1.$$

Take any two points on the graph of f and join them by a chord. Then f is
concave if the chord lies on or below the graph.


Fig. 3.3 Examples of convex and strictly convex functions 


For example, $f(x) = x^2$ is convex and $f(x) = -x^2$ is concave. A function
$f : (a,b) \to \mathbf{R}$ that is both convex and concave is called affine. It is easy to
see that $f : (a,b) \to \mathbf{R}$ is affine iff f' = m is constant on (a,b).
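
The chord condition in the definition of convexity can be tested on sample points. The sketch below computes the gap between the chord and the graph; the endpoints and the grid of t values are illustrative choices. A nonnegative gap at every sampled t is consistent with convexity on that chord, while a single negative gap rules it out.

```python
def chord_gap(f, x, y, t):
    """(1-t) f(x) + t f(y) - f((1-t) x + t y); nonnegative for convex f."""
    return (1 - t) * f(x) + t * f(y) - f((1 - t) * x + t * y)

ts = [k / 10 for k in range(11)]

# x**2 is convex, so every gap on the chord from -1 to 2 is nonnegative.
square_gaps = [chord_gap(lambda x: x * x, -1.0, 2.0, t) for t in ts]

# x**3 is not convex on (-2, 1): some gaps on this chord are negative.
cube_gaps = [chord_gap(lambda x: x ** 3, -2.0, 1.0, t) for t in ts]

print(min(square_gaps), min(cube_gaps))
```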

Similarly, we say that f is strictly convex on (a,b) if, for all a < x < y < b,

$$f((1 - t)x + ty) < (1 - t)f(x) + tf(y), \qquad 0 < t < 1.$$

Take any two points on the graph of f and join them by a chord. Then f is
strictly convex if the chord lies strictly above the graph. Similarly, we define
strictly concave.

Note that a strictly convex $f : (a,b) \to \mathbf{R}$ cannot attain its infimum m
at more than one point in (a,b). Indeed, if f had two minima at x and x'
and x'' = (x + x')/2, then f(x'') < [f(x) + f(x')]/2 = (m + m)/2 = m,
contradicting the fact that m is a minimum.

The negative of a (strictly) convex function is (strictly) concave. 


Theorem 3.3.1. Suppose that f is differentiable on (a,b). Then f is convex
iff f' is increasing, and f is concave iff f' is decreasing. Moreover, f is strictly
convex iff f' is strictly increasing, and f is strictly concave iff f' is strictly
decreasing. If f is twice differentiable on (a,b), then f is convex iff $f'' \ge 0$,
and f is concave iff $f'' \le 0$. Moreover, f is strictly convex if f'' > 0, and f
is strictly concave if f'' < 0.


Since -f is convex iff f is concave, we derive only the convex part. First,
suppose that f' is increasing. If a < x < y < b and 0 < t < 1, let
z = (1 - t)x + ty. Then

$$\frac{f(z) - f(x)}{z - x} = f'(c)$$

for some x < c < z. Also

$$\frac{f(y) - f(z)}{y - z} = f'(d)$$

for some z < d < y. Since $f'(c) \le f'(d)$,

$$\frac{f(z) - f(x)}{z - x} \le \frac{f(y) - f(z)}{y - z}. \tag{3.3.2}$$

Clearing denominators in this last inequality, we obtain convexity. Conversely,
suppose that f is convex, and let a < x < z < y < b. Then we have (3.3.2).
If a < x < z < y < w < b, apply (3.3.2) to a < x < z < y < b and then
(3.3.2) to a < z < y < w < b. Combining the resulting inequalities yields

$$\frac{f(z) - f(x)}{z - x} \le \frac{f(w) - f(y)}{w - y}.$$

Fixing x, y, and w and letting $z \to x$ yields

$$f'(x) \le \frac{f(w) - f(y)}{w - y}.$$

Let $w \to y$ to obtain $f'(x) \le f'(y)$; hence, f' is increasing. If f' is strictly
increasing, then the inequality (3.3.2) is strict; hence, f is strictly convex.
Conversely, if f' is increasing but f'(c) = f'(d) for some c < d, then f' is
constant on [c,d]. Hence, f is affine on [c,d], contradicting strict convexity.
This shows that f is strictly convex iff f' is strictly increasing.

When f is twice differentiable, $f'' \ge 0$ iff f' is increasing, hence, the third
statement. Since f'' > 0 implies f' strictly increasing, we also have the fourth
statement.

A key feature of convexity (Figure 3.4) is that the graph of a convex 
function lies above any of its tangent lines (Exercise 3.3.7). 

A real c is an inflection point of f if f is convex on one side of c and
concave on the other. For example, c = 0 is an inflection point of $f(x) = x^3$
since f is convex on x > 0 and concave on x < 0. From the theorem, we see
that f''(c) = 0 at any inflection point c where f is twice differentiable.

Fig. 3.4 A convex function lies above any of its tangents


If c is a critical point and f''(c) > 0, then f' is strictly increasing near
c; hence, f'(x) < 0 for x < c near c and f'(x) > 0 for x > c near c. Thus,
f'(c) = 0 and f''(c) > 0 implies c is a local minimum. Similarly, f'(c) = 0
and f''(c) < 0 implies c is a local maximum. The converses are not, generally,
true since c = 0 is a minimum of $f(x) = x^4$, but f'(0) = f''(0) = 0.

For example, $f(x) = a^x$, a > 1, satisfies $f'(x) = a^x \log a$ and
$f''(x) = a^x (\log a)^2$. Since log a > 0, $a^x$ is increasing and strictly convex everywhere.
Also $f(x) = \log_a x$ has $f'(x) = 1/(x \log a)$ and $f''(x) = -1/(x^2 \log a)$, so $\log_a x$
is increasing and strictly concave everywhere. The graphs are shown in Figure 3.5.


Fig. 3.5 The exponential and logarithm functions 


In the following, we sketch the graphs of some twice differentiable functions 
on an interval (a,b), using knowledge of the critical points, the inflection 
points, the signs of f’ and f”, and f(a+), f(b—). 

If $f(x) = 1/(x^2 + 1)$, $-\infty < x < \infty$, then $f'(x) = -2x/(x^2 + 1)^2$. Hence,
f'(x) < 0 for x > 0 and f'(x) > 0 for x < 0, so f is increasing for x < 0 and
decreasing for x > 0. Hence, 0 is a global maximum. Moreover,

$$f''(x) = \frac{-2(x^2 + 1) + 8x^2}{(x^2 + 1)^3} = \frac{6x^2 - 2}{(x^2 + 1)^3},$$

so f''(0) < 0, which is consistent with 0 being a maximum. Now f''(x) < 0
on $|x| < 1/\sqrt{3}$ and f''(x) > 0 on $|x| > 1/\sqrt{3}$. Hence, $x = \pm 1/\sqrt{3}$ are
inflection points. Since f(0) = 1 and $f(\infty) = f(-\infty) = 0$, we obtain the
graph in Figure 3.6.

Fig. 3.6 $f(x) = 1/(1 + x^2)$


Let $f(x) = 1/\sqrt{x(1 - x)}$, 0 < x < 1. Then

$$f'(x) = \frac{1}{2} \cdot \frac{2x - 1}{[x(1 - x)]^{3/2}}$$

and

$$f''(x) = \frac{1}{4} \cdot \frac{3 - 8x(1 - x)}{[x(1 - x)]^{5/2}}.$$

Thus, x = 1/2 is a critical point. Since f''(1/2) > 0, x = 1/2 is a local
minimum. In fact, f is decreasing to the left of 1/2 and increasing to the
right of 1/2; hence, 1/2 is a global minimum. Since 3 - 8x(1 - x) > 0 on
(0,1), f''(x) > 0; hence, f is convex. Since $f(0+) = \infty$ and $f(1-) = \infty$, the
graph is as shown in Figure 3.7.


Fig. 3.7 $f(x) = 1/\sqrt{x(1 - x)}$


Let $f(x) = \dfrac{3x + 1}{x(1 - x)}$. This rational function is defined away from x = 0 and
x = 1. Thus, we graph f on the intervals $(-\infty, 0)$, (0,1), $(1, \infty)$. Computing,

$$f'(x) = \frac{3x^2 + 2x - 1}{x^2(1 - x)^2}.$$

Solving $3x^2 + 2x - 1 = 0$, x = -1, 1/3 are the critical points. Moreover,
$f(-\infty) = 0$. Since there are no critical points in $(1, \infty)$, f is increasing on
$(1, \infty)$. Moreover,

$$f''(x) = \frac{2\,[1 + 3x^3 - 3x(1 - x)]}{x^3(1 - x)^3},$$

so f is concave on $(1, \infty)$. Moreover, the numerator in f''(x) is $\ge 1/2$ on
(0,1). Hence, f''(x) > 0 on (0,1). Hence, f is convex in (0,1) and x = 1/3
is a minimum of f. Since f''(-1) = -1 < 0, x = -1 is a maximum. Since
$f''(x) \to 0+$ as $x \to -\infty$, there is an inflection point in $(-\infty, -1)$. Thus, the
graph is as shown in Figure 3.8.


Fig. 3.8 $f(x) = \dfrac{3x + 1}{x(1 - x)}$


To graph $x^n e^{-x}$, $x \ge 0$, we will need to show that

$$\lim_{x \to \infty} x^n e^{-x} = 0. \tag{3.3.3}$$

To establish the limit, we first show that

$$e^x \ge 1 + x + \frac{x^2}{2!} + \dots + \frac{x^n}{n!}, \qquad x \ge 0, \tag{3.3.4}$$

for all $n \ge 1$. We do this by induction. For n = 1, let $f(x) = e^x$, g(x) = 1 + x.
Then $f'(x) = e^x \ge 1 = g'(x)$ on $x \ge 0$ and f(0) = g(0); hence, $f(x) \ge g(x)$,
establishing (3.3.4) for n = 1. Now let $g_n(x)$ denote the right side of (3.3.4),
and suppose that (3.3.4) is true for some $n \ge 1$. Since $f'(x) = f(x) \ge g_n(x) = g_{n+1}'(x)$
and $f(0) = g_{n+1}(0)$, we conclude that $f(x) \ge g_{n+1}(x)$, establishing
(3.3.4) for n + 1. By induction, (3.3.4) is true for all $n \ge 1$. Now (3.3.4) with
n + 1 replacing n implies $e^x \ge x^{n+1}/(n + 1)!$, which implies

$$x^n e^{-x} \le \frac{(n + 1)!}{x}, \qquad x > 0,$$

which implies (3.3.3).
Setting $f_n(x) = x^n e^{-x}$, $n \ge 1$, we have $f_n(0) = 0$ and $f_n(\infty) = 0$. Moreover,

$$f_n'(x) = x^{n-1}(n - x)e^{-x}$$

and

$$f_n''(x) = x^{n-2}\left[x^2 - 2nx + n(n - 1)\right] e^{-x}.$$

Thus, the critical point is x = n, and $f_n$ is increasing on (0, n) and decreasing
on $(n, \infty)$. Hence, x = n is a max. The reals $x = n \pm \sqrt{n}$ are inflection points.
Between them, $f_n$ is concave and elsewhere convex. The graph is as shown
in Figure 3.9.


Fig. 3.9 $f_n(x) = x^n e^{-x}$


If we let $n \nearrow \infty$ in (3.3.4), we obtain

$$e^x \ge \sum_{n=0}^{\infty} \frac{x^n}{n!}, \qquad x \ge 0. \tag{3.3.5}$$

As a consequence (nth term test §1.5),

$$\lim_{n \nearrow \infty} \frac{x^n}{n!} = 0 \tag{3.3.6}$$

for all $x \ge 0$, hence, for all x. Using (3.3.3), we can derive other limits.


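Theorem 3.3.2. For a > 0, b > 0, and c > 0,

A. $\lim_{x \to \infty} x^a e^{-bx} = 0$;
B. $\lim_{t \to 0+} t^b(-\log t)^a = 0$; in particular, $t \log t \to 0$ as $t \to 0+$;
C. $\lim_{n \nearrow \infty} (\log n)^a/n^b = 0$; in particular, $\log n/n \to 0$ as $n \nearrow \infty$;
D. If c < 1, $\lim_{n \nearrow \infty} n^a c^n = 0$. If c > 1, $\lim_{n \nearrow \infty} n^{-a} c^n = \infty$;
E. $\lim_{n \nearrow \infty} n^{1/n} = 1$.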


To obtain the first limit, choose n > a, and let y = bx. Then $x \to \infty$ implies
$y \to \infty$; hence, for $y \ge 1$, $x^a e^{-bx} = y^a e^{-y}/b^a \le y^n e^{-y}/b^a \to 0$ by (3.3.3). Substituting
$t = e^{-x}$ in the first yields the second since $e^{-x} \to 0+$, as $x \to \infty$. Substituting
t = 1/n in the second yields the third. For the fourth, in the first, replace x
by n and $e^{-b}$ by c, if c < 1. If c > 1, $n^{-a}c^n = 1/(n^a(1/c)^n) \to \infty$ by what
we just derived. For the fifth, take the exponential of both sides of the third
with a = b = 1. Since $e^x$ is continuous, we obtain the fifth.

The moral of the theorem is $\log n \ll n \ll e^n$ as $n \nearrow \infty$, where $A \ll B$
means that any positive power of A is much smaller than B.

Let us use (3.3.1) to derive the binomial theorem. If $n \ge 1$, then $(c + x)^n$
is a polynomial of degree n; hence, there are numbers $a_0, \dots, a_n$ with

$$f(x) = (c + x)^n = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0. \tag{3.3.7}$$

Let us compute the derivatives of f by using either of the expressions in
(3.3.7). The left expression and (3.3.1) and the chain rule yield
$f^{(k)}(0) = n!/(n - k)!\; c^{n-k}$. The right expression yields $f^{(k)}(0) = k!\,a_k$. Hence,
$a_k = n!\,c^{n-k}/(n - k)!\,k!$. Now define the binomial coefficient (read "n-choose-k")

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!} = \frac{n(n - 1) \cdots (n - k + 1)}{1 \cdot 2 \cdots k}, \qquad 0 \le k \le n.$$

Then we obtain the following.


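Theorem 3.3.3 (Binomial Theorem). If $n \ge 1$ and $a, b \in \mathbf{R}$, then

$$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \dots + \binom{n}{n-1} a b^{n-1} + b^n.$$

Since $\binom{n}{k} \le n^k/k!$, the binomial theorem implies, for $x \ge 0$,

$$\left(1 + \frac{x}{n}\right)^n = \sum_{k=0}^{n} \binom{n}{k} \left(\frac{x}{n}\right)^k \le \sum_{k=0}^{n} \frac{x^k}{k!}. \tag{3.3.8}$$

Let $n \nearrow \infty$ and use Exercise 3.2.3 to obtain

$$e^x \le \sum_{k=0}^{\infty} \frac{x^k}{k!}, \qquad x \ge 0. \tag{3.3.9}$$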
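
The coefficient formula and the inequality (3.3.8) can be checked on small cases. In the sketch below the function names and the sample values (c = 2, n = 4, x = 1.5, n = 10) are illustrative choices; the binomial coefficient is computed exactly with integer arithmetic.

```python
from math import factorial

def binomial(n, k):
    """n! / (k! (n-k)!), computed with exact integer arithmetic."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def power_coefficients(c, n):
    """Coefficients a_0..a_n of (c + x)**n, using a_k = binomial(n, k) * c**(n-k)."""
    return [binomial(n, k) * c ** (n - k) for k in range(n + 1)]

def partial_exponential_sum(x, n):
    """Sum of x**k / k! for k = 0..n, the right side of (3.3.8)."""
    return sum(x ** k / factorial(k) for k in range(n + 1))

coeffs = power_coefficients(2, 4)                      # (2 + x)**4
value_at_3 = sum(a_k * 3 ** k for k, a_k in enumerate(coeffs))

lhs = (1 + 1.5 / 10) ** 10                             # left side of (3.3.8)
rhs = partial_exponential_sum(1.5, 10)
print(coeffs, value_at_3, lhs, rhs)
```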


We say $f : \mathbf{R} \to \mathbf{R}$ is superlinear if

$$\lim_{|x| \to \infty} \frac{f(x)}{|x|} = +\infty.$$

Given $f : \mathbf{R} \to \mathbf{R}$, its Legendre transform is the function

$$g(y) = \sup_{-\infty < x < \infty} (xy - f(x)), \qquad y \in \mathbf{R}. \tag{3.3.10}$$

Exercise 2.3.20 shows that the sup is attained; hence, g is well defined when
f is superlinear and continuous. Note g(y) can be thought of as a sup of lines
$y \mapsto xy - f(x)$, one for each x (compare with Exercise 3.3.11).

Below is a set of exercises that show that the Legendre transform of a
convex superlinear function is well defined, and derive the result that the
Legendre transform of the Legendre transform of a convex superlinear function
$f$ is $f$: If $g$ is the Legendre transform of $f$, then $f$ is the Legendre
transform of $g$,

$$f(x) = \sup_{-\infty < y < \infty} \bigl(xy - g(y)\bigr), \qquad x \in \mathbb{R}. \tag{3.3.11}$$


Examples of Legendre transforms are given in Exercises 3.1.3 and 3.3.14. 
The perfect symmetry between f and its Legendre transform g is exhibited 
in Exercises 3.3.16, 3.3.20, and 3.3.23. 
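
As a rough numerical illustration of the definition (3.3.10), the sup can be approximated by a maximum over a finite grid. This is a sketch under stated assumptions, not a method from the text: the names `legendre_transform` and `power_f`, the window $[-10, 10]$, and the grid size are all hypothetical choices, and the grid maximum can only underestimate the true sup.

```python
def legendre_transform(f, y, lo=-10.0, hi=10.0, steps=20001):
    """Approximate g(y) = sup_x (x*y - f(x)) by a maximum over a uniform grid on [lo, hi]."""
    h = (hi - lo) / (steps - 1)
    best = float("-inf")
    for i in range(steps):
        x = lo + i * h
        best = max(best, x * y - f(x))
    return best


def power_f(p):
    """Return the function x -> |x|**p / p."""
    return lambda x: abs(x) ** p / p


if __name__ == "__main__":
    p, q = 3.0, 1.5  # conjugate exponents: 1/p + 1/q = 1
    for y in (-2.0, 0.5, 1.0, 2.0):
        # Second column: grid approximation; third column: |y|^q / q.
        print(y, legendre_transform(power_f(p), y), abs(y) ** q / q)
```

With $f(x) = |x|^p/p$ the printed columns should agree closely with $|y|^q/q$, provided the maximizing $x$ lies well inside the chosen window.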


Exercises 


3.3.1. Graph $f(x) = (x + 2/x)/2$ for $x > 0$.

3.3.2. Show that $f : (a,b) \to \mathbb{R}$ is affine iff $f'(x) = m$ is a constant on $(a,b)$.

3.3.3. Suppose $f : (a,b) \to (c,d)$ is convex and $g : (c,d) \to \mathbb{R}$ is increasing
and convex. Show that the composition $g \circ f : (a,b) \to \mathbb{R}$ is convex.

3.3.4. Let $f : \mathbb{R} \to \mathbb{R}$ be convex, and for $b \ne a$, let

$$s[a,b] = \frac{f(b) - f(a)}{b - a}.$$

Show that $a < b < c$ implies $s[a,b] \le s[a,c] \le s[b,c]$.


3.3.5. Suppose that $f : (a,b) \to \mathbb{R}$ is convex. Then for all $c \in (a,b)$,

$$f'_\pm(c) = \lim_{x \to c\pm} \frac{f(x) - f(c)}{x - c}$$

both exist, and

$$f'_+(c) \le \frac{f(x) - f(c)}{x - c} \le f'_-(d), \qquad a < c < x < d < b. \tag{3.3.12}$$
Moreover $f'_- \le f'_+$, and both $f'_+$ and $f'_-$ are increasing on $(a,b)$.

3.3.6. If $f : (a,b) \to \mathbb{R}$ is convex, then $f$ is continuous on $(a,b)$.

3.3.7. Suppose that $f : (a,b) \to \mathbb{R}$ is convex and let $a < c < b$. Then

$$f(x) \ge f(c) + f'_\pm(c)(x - c), \qquad a < x < b.$$

In particular, if $f$ is differentiable at $c$, then the graph of $f$ lies above its
tangent line at $c$,

$$f(x) \ge f(c) + f'(c)(x - c), \qquad a < x < b.$$

3.3.8. Let $f : (a,b) \to \mathbb{R}$ and let $a < c < b$. A subdifferential of $f$ at $c$ is a
real $p$ satisfying

$$f(x) \ge f(c) + p(x - c), \qquad a < x < b.$$

Show that when $f'_\pm(c)$ both exist, we have $f'_-(c) \le p \le f'_+(c)$. Show also
that when $f$ is convex, the set of subdifferentials of $f$ at $c$ exactly equals the
interval $[f'_-(c), f'_+(c)]$.


3.3.9. (Maximum Principle) Suppose that $f : (a,b) \to \mathbb{R}$ is convex and
has a maximum at some $c$ in $(a,b)$. Then $f$ is constant (use subdifferentials).

3.3.10. Suppose that $f : (a,b) \to \mathbb{R}$ is convex, $g : (a,b) \to \mathbb{R}$ is differentiable,
and $f - g$ attains its supremum at some $a < c < b$. Show that $f'(c)$ exists
and equals $g'(c)$ (use subdifferentials).

3.3.11. If $f_1, \dots, f_n$ are convex on $(a,b)$, then so is

$$f = \max(f_1, \dots, f_n).$$

In particular, if $f_1, \dots, f_n$ are lines, then $f$ is convex. Exercise 3.3.12 shows
that this is also sometimes true for infinitely many lines.


3.3.12. If f : R — R is superlinear and continuous, then the sup is at- 
tained in (3.3.10), the sup is attained in (3.3.11) (Exercise 2.3.20 and Exer- 
cise 2.3.21), and the Legendre transform g is convex. Moreover, for each y, if 
x attains the sup in the definition (3.3.10) of g(y), then x is a subdifferential 
of g at y. 


3.3.13. If $f : \mathbb{R} \to \mathbb{R}$ is superlinear and even, then its Legendre transform is
even, and $g(y)$, $y \ge 0$, can be computed by restricting over $x \ge 0$:

$$g(y) = \sup_{x \ge 0} \bigl(xy - f(x)\bigr), \qquad y \ge 0.$$

3.3.14. Let $p > 1$ and $q > 1$ satisfy $(1/p) + (1/q) = 1$, and let $f(x) = |x|^p/p$.
Show, by direct computation, that the Legendre transform is $g(y) = |y|^q/q$.
Also show $f'$ and $g'$ are inverses. (Use Exercise 3.3.13.)

3.3.15. Let $f(x) = e^{|x|} - 1$. Show, by direct computation, that the Legendre
transform is $g(y) = |y| \log |y| - |y| + 1$ for $|y| \ge 1$ and $g(y) = 0$ for $|y| \le 1$
(Exercise 3.3.13).


3.3.16. Suppose that f : R — R is superlinear and convex. Show that the 
Legendre transform g of f is superlinear and convex and f is the Legendre 
transform of g, i.e., (3.3.11) holds. Show also that this result is false if f is 
not assumed convex. 


3.3.17. If $f$ is superlinear and convex, then for each $y$, the sup in the definition
(3.3.10) of $g(y)$ is attained at $x$ iff $x$ is a subdifferential of $g$ at $y$. Show
also that this result is false if $f$ is not assumed convex.

3.3.18. If $f : (a,b) \to \mathbb{R}$ is convex and differentiable, then $f' : (a,b) \to \mathbb{R}$ is
continuous (recall $f'$ is then increasing).

3.3.19. If $f : \mathbb{R} \to \mathbb{R}$ is superlinear and strictly convex, then its Legendre
transform $g$ is differentiable and $g'$ is continuous (start by showing that the
sup in the definition (3.3.10) of $g(y)$ is attained at a unique $x$).

3.3.20. If $f : \mathbb{R} \to \mathbb{R}$ is superlinear, differentiable, and strictly convex, then
its Legendre transform $g$ is superlinear, differentiable, and strictly convex,
and $f'$ is the inverse of $g'$.


3.3.21. Graph f, g, f’, and g’ where f and g are as in Exercise 3.3.15. 


3.3.22. Show that $f(x) = e^x$ is convex on $\mathbb{R}$. Deduce the inequality
$a^t b^{1-t} \le ta + (1-t)b$ valid for $a > 0$, $b > 0$, and $0 \le t \le 1$.

3.3.23. If $f$ is superlinear, smooth, and strictly convex, then the Legendre
transform $g$ of $f$ is superlinear, smooth, and strictly convex iff $f''(x) > 0$ for
all $x \in \mathbb{R}$. In this case, we have

$$g''(y) = \frac{1}{f''(x)}$$

whenever $y = f'(x)$ or equivalently $x = g'(y)$. Also give an example of a
superlinear, smooth, and strictly convex $f$ with a non-smooth Legendre
transform $g$.

3.3.24. Suppose $f : \mathbb{R} \to \mathbb{R}$ is smooth. We say $r$ is a root of $f$ of order $n$
if $f(r) = f'(r) = \dots = f^{(n-1)}(r) = 0$. We say $f$ has $n$ roots if there are
distinct reals $r_1, \dots, r_k$ and naturals $n_1, \dots, n_k$ such that $n_1 + \dots + n_k = n$
and $r_j$ is a root of $f$ of order $n_j$, $j = 1, \dots, k$. Show that if $f$ has $n$ roots
in an interval $(a,b)$, then $f'$ has $n - 1$ roots in the same interval $(a,b)$ (use
Exercise 3.1.12).

3.3.25. Show that a degree $n$ polynomial $f$ has $n$ roots iff

$$f(x) = C(x - r_1)^{n_1}(x - r_2)^{n_2} \cdots (x - r_k)^{n_k}$$

for some distinct reals $r_1, \dots, r_k$ and naturals $n_1, \dots, n_k$ satisfying
$n_1 + \dots + n_k = n$ (use induction on $n$ for the only if part).

3.3.26. If $f$ is a degree $n$ polynomial with $n$ negative roots, then
$g(x) = x^n f(1/x)$ is a degree $n$ polynomial with $n$ negative roots.


3.3.27. Given positive reals $a_1, \dots, a_n$, not necessarily distinct, let the reals
$p_1, \dots, p_n$ be the coefficients³ given by

$$(x + a_1)(x + a_2) \cdots (x + a_n) = x^n + \binom{n}{1} p_1 x^{n-1} + \dots + \binom{n}{n-1} p_{n-1} x + p_n.$$

Let $f(x)$ denote the polynomial on the right. Show the following.

A. $f$ has $n$ negative roots.

B. Differentiating $f$ $n - k - 1$ times ($1 \le k \le n - 1$) yields a $(k+1)$-degree
polynomial $g$ with $k + 1$ negative roots.

C. Show that $h(x) = x^{k+1} g(1/x)$ is a degree $k + 1$ polynomial with $k + 1$
negative roots.

D. Differentiating $h$ $k - 1$ times, there are two roots for the quadratic
polynomial

$$p(x) = \frac{1}{2}\, n! \left(p_{k-1} + 2 p_k x + p_{k+1} x^2\right).$$

Conclude that $p_k^2 \ge p_{k-1} p_{k+1}$ (Exercise 1.4.5). This result is due to Newton.

3.3.28. With $p_1, \dots, p_n$ as in the previous exercise and with $a_1, \dots, a_n$
positive, show that $p_1 \ge p_2^{1/2} \ge p_3^{1/3} \ge \dots \ge p_n^{1/n}$, with equality throughout only if all
the $a_j$'s are equal. This result is due to Maclaurin.


3.3.29. Let $a$ be an integer, and let $p(x)$ be a polynomial with integer
coefficients satisfying $p(a) = 0$. Set $f(t, x) = \exp(t p(x))$. Show that

$$f^{(k)}(t, a)$$

is a polynomial in $t$ with integer coefficients for all $k \ge 0$. Here derivatives
are taken with respect to $x$.

3.3.30. Let $a$ and $b$ be naturals and set $p = a/b \in \mathbb{Q}$. For $n \ge 0$, let

$$g_n(x) = \frac{(bx)^n (a - bx)^n}{n!}, \qquad 0 \le x \le p.$$

Show that $g_n^{(k)}(0)$ and $g_n^{(k)}(p)$ are integers for all $k \ge 0$ and $n \ge 0$. (Apply the
previous exercise to $p(x) = x(a - x)$ and let $g(t, x) = f(t, bx) = \exp(t p(bx))$.)


³ $p_1, \dots, p_n$ are the normalized elementary symmetric polynomials in $a_1, \dots, a_n$.

3.4 Power Series


A power series is a series of the form

$$\sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \dots.$$

Since this is a series involving a variable $x$, it may converge at some $x$'s and
diverge at other $x$'s. For example (§1.6),

$$1 + x + x^2 + \dots \begin{cases} = \dfrac{1}{1-x}, & |x| < 1, \\ \text{diverges}, & |x| \ge 1. \end{cases}$$

Note that the set of $x$'s for which this series converges is an interval centered
at zero. This is not an accident (Figure 3.10).


Theorem 3.4.1. Let $\sum a_n x^n$ be a power series. Either the series converges
absolutely for all $x$, the series converges only at $x = 0$, or there is an $R > 0$,
such that the series converges absolutely if $|x| < R$ and diverges if $|x| > R$.


To derive the theorem, let $R = \sup\{|x| : \sum a_n x^n \text{ converges}\}$. If $R = 0$,
the series converges only for $x = 0$. If $R > 0$ and $|x| < R$, choose $c$ with
$|x| < |c| < R$ and $\sum a_n c^n$ convergent. Then $(a_n c^n)$ is a bounded sequence
by the nth term test, say $|a_n c^n| \le C$, $n \ge 0$, and it follows that

$$\sum |a_n x^n| = \sum |a_n c^n| \cdot |x/c|^n \le C \sum |x/c|^n < \infty,$$

since $|x/c| < 1$ and the last series is geometric. This shows that $\sum a_n x^n$
converges absolutely for all $x$ in $(-R, R)$. On the other hand, if $R < \infty$, the
definition of $R$ shows that $\sum a_n x^n$ diverges for $|x| > R$. Finally, if $R = \infty$,
this shows that the series converges absolutely for all $x$.


[Figure: the real line with the points $-R$, $0$, $R$ marked]

Fig. 3.10 Region of convergence of a power series


Note that the theorem says nothing about $x = R$ and $x = -R$. At these
two points, anything can happen. For example, the power series

$$f(x) = 1 - \frac{x}{1} + \frac{x^2}{2} - \frac{x^3}{3} + \dots$$

has radius $R = 1$ and converges for $x = 1$ but diverges for $x = -1$. On
the other hand, the series $f(-x)$ has $R = 1$ and converges for $x = -1$ but
diverges for $x = 1$. The geometric series has $R = 1$ but diverges at $x = 1$ and
$x = -1$, whereas

$$1 + \frac{x}{1^2} + \frac{x^2}{2^2} + \frac{x^3}{3^2} + \dots$$

has $R = 1$ and converges at $x = 1$ and $x = -1$. Because the interval of
convergence is $(-R, R)$, the number $R$ is called the radius of convergence.
If the series converges only at $x = 0$, we say $R = 0$, whereas if the series
converges absolutely for all $x$, we say $R = \infty$.

Here are two useful formulas for the radius R. 


Theorem 3.4.2 (Root Test). Let $\sum a_n x^n$ be a power series and let $L$
denote the upper limit of the sequence $(|a_n|^{1/n})$,

$$L = \limsup_{n \to \infty} |a_n|^{1/n}.$$

Then $R = 1/L$ where we take $1/0 = \infty$ and $1/\infty = 0$.


To derive this, it is enough to show that $LR = 1$. If $|x| < R$, then $\sum a_n x^n$
converges absolutely; hence, $|a_n| \cdot |x|^n = |a_n x^n| \le C$ for some $C$ (possibly
depending on $x$); thus, $|a_n|^{1/n} |x| \le C^{1/n}$. Taking the upper limit of both
sides, we conclude that $L|x| \le 1$. Since $|x|$ may be as close as desired to $R$,
we obtain $LR \le 1$. On the other hand, if $x > R$, then the sequence $(a_n x^n)$
is unbounded (otherwise, the series converges on $(-x, x)$ contradicting the
definition of $R$). Hence, some subsequence of $(|a_n||x|^n)$ is bounded below by
1, which implies that some subsequence of $(|a_n|^{1/n}|x|)$ is bounded below by
1. We conclude that $L|x| \ge 1$. Since $|x|$ may be as close as desired to $R$, we
obtain $LR \ge 1$. Hence, $LR = 1$.


Theorem 3.4.3 (Ratio Test). Let $(a_n)$ be a nonzero sequence, and suppose
that

$$\rho = \lim_{n \to \infty} |a_n|/|a_{n+1}|$$

exists. Then $\rho$ equals the radius of convergence $R$ of $\sum a_n x^n$.


To show that $\rho = R$, we show that $\sum a_n x^n$ converges when $|x| < \rho$ and
diverges when $|x| > \rho$. If $|x| < \rho$, choose $c$ with $|x| < |c| < \rho$. Then
$|c| \le |a_n|/|a_{n+1}|$ for $n \ge N$. Hence,

$$|a_{n+1} x^{n+1}| = \left|\frac{a_{n+1}}{a_n}\right| \cdot |c| \cdot |a_n x^n| \cdot \left|\frac{x}{c}\right| \le |a_n x^n| \cdot \left|\frac{x}{c}\right|, \qquad n \ge N.$$

Iterating this, we obtain $|a_{N+2} x^{N+2}| \le |x/c|^2 |a_N x^N|$. Continuing in this
manner, we obtain $|a_n x^n| \le C |x/c|^{n-N}$, $n \ge N$, for some constant $C$. Since
$\sum |x/c|^n$ converges, this shows that $\sum |a_n x^n|$ converges. On the other hand, if
$|x| > \rho$, then the same argument shows that $|a_{n+1} x^{n+1}| \ge |a_n x^n|$ for $n \ge N$;
hence, $a_n x^n \not\to 0$. By the nth term test, $\sum a_n x^n$ diverges. Thus, $\rho = R$.
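
To get a feel for the ratio test, one can tabulate the ratios $|a_n|/|a_{n+1}|$ for a few familiar coefficient sequences. The sketch below uses exact rational arithmetic; the helper names and the three example sequences are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import factorial


def ratios(coeff, n_values):
    """Return the exact ratios |a_n| / |a_{n+1}| for the given indices n."""
    return [abs(coeff(n)) / abs(coeff(n + 1)) for n in n_values]


def geometric(n):
    """a_n = 2**n; the ratios are constantly 1/2."""
    return Fraction(2) ** n


def inverse_square(n):
    """a_n = 1/(n+1)**2; the ratios tend to 1."""
    return Fraction(1, (n + 1) ** 2)


def exponential(n):
    """a_n = 1/n!; the ratios n + 1 grow without bound."""
    return Fraction(1, factorial(n))


if __name__ == "__main__":
    ns = [1, 10, 100, 1000]
    for name, coeff in [("2^n", geometric),
                        ("1/(n+1)^2", inverse_square),
                        ("1/n!", exponential)]:
        print(name, [float(r) for r in ratios(coeff, ns)])
```

The three rows suggest radii $1/2$, $1$, and $\infty$ respectively; a finite table is of course only suggestive of the limit.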
In the previous section, we saw (3.3.9) that the series

$$\exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$

converges to $e^x$ for $x \ge 0$. Since the region of convergence of a power series
is an interval centered at zero, it follows that $\exp(x)$ converges absolutely for
all real $x$. Now for $x$ and $y$ real, the Cauchy product $\exp(x)\exp(y)$ (§1.7) of
$\exp(x)$ and $\exp(y)$ is

$$\sum_{n=0}^{\infty} \sum_{i+j=n} \frac{x^i}{i!} \frac{y^j}{j!} = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{i+j=n} \binom{n}{i} x^i y^j = \sum_{n=0}^{\infty} \frac{(x+y)^n}{n!} = \exp(x+y).$$

By selecting $y = -x$ for $x$ positive, $e^x \exp(-x) = \exp(x)\exp(-x) = \exp(0) = 1$;
hence, $\exp(x) = e^x$ for $x$ negative as well.

As an application, using the formula for the radius of convergence together
with the fact that exp converges everywhere yields

$$\lim_{n \to \infty} (n!)^{1/n} = \infty. \tag{3.4.1}$$

This can also be derived directly.

We have shown that

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots, \qquad x \in \mathbb{R}. \tag{3.4.2}$$

In particular, we have arrived at a series for $e$,

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots.$$

In §1.6, we obtained $2.5 < e < 3$. In fact, by the addition of sufficiently many
terms, $e$ can be computed to arbitrarily many places.
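
As a sketch of how "sufficiently many terms" might be quantified, the partial sums can be computed exactly with rationals and paired with a tail bound. The bound $\sum_{k>n} 1/k! < 1/(n!\,n)$ used below is a standard comparison with a geometric series and is an assumption brought in for illustration; it is not derived in the text above. The function name `e_bounds` is hypothetical.

```python
import math
from fractions import Fraction


def e_bounds(n: int):
    """Return rationals (lower, upper) with lower < e < upper, using 1/0!, ..., 1/n!.

    Assumes n >= 1. The upper bound uses the tail estimate
    sum_{k>n} 1/k! < 1/(n! * n), obtained by comparison with a geometric series.
    """
    partial, term = Fraction(0), Fraction(1)  # term holds 1/k!
    for k in range(n + 1):
        partial += term
        term /= k + 1
    tail_bound = term * (n + 1) / n  # term is now 1/(n+1)!, so this is 1/(n! * n)
    return partial, partial + tail_bound


if __name__ == "__main__":
    for n in (1, 5, 10):
        lower, upper = e_bounds(n)
        print(n, float(lower), float(upper), math.e)
```

The gap between the two bounds shrinks factorially, which is why a handful of terms already pins down several decimal places.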

What other functions can be expressed as power series? Two examples are 
the even and the odd parts of exp. 

The even and odd parts (§3.1) of exp are the hyperbolic cosine cosh and
the hyperbolic sine sinh (these are pronounced to rhyme with "gosh" and
"cinch"). Thus,

$$\sinh x = \frac{e^x - e^{-x}}{2} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$$

and

$$\cosh x = \frac{e^x + e^{-x}}{2} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \dots.$$

Since

$$\sum_{n=0}^{\infty} |x^n/n!| = \sum_{n=0}^{\infty} |x|^n/n! = \exp|x|,$$

the series exp is absolutely convergent on all of $\mathbb{R}$. By comparison with exp,
the series for sinh and cosh are also absolutely convergent on all of $\mathbb{R}$. From
their definitions, $\cosh' = \sinh$ and $\sinh' = \cosh$.

Note that the series for $\sinh x$ involves only odd powers of $x$ and the series
for $\cosh x$ involves only even powers of $x$. This holds for all odd and even
functions (Exercise 3.4.7).

To obtain other examples, we write alternating versions (§1.7) of the last
two series obtaining

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$$

and

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots.$$


These functions are studied in §3.6. Again, by comparison with exp, sin and 
cos are absolutely convergent series on all of R. However, unlike exp, cosh, 
and sinh, we do not as yet know sin’ and cos’. 

It turns out that functions constructed from power series are smooth in 
their interval of convergence. They have derivatives of all orders. 


Theorem 3.4.4. Let $f(x) = \sum a_n x^n$ be a power series with radius of
convergence $R > 0$. Then

$$\sum_{n=1}^{\infty} n a_n x^{n-1} = a_1 + 2 a_2 x + 3 a_3 x^2 + \dots \tag{3.4.3}$$

has radius of convergence $R$, $f$ is differentiable on $(-R, R)$, and $f'(x)$ equals
(3.4.3) for all $x$ in $(-R, R)$.


In other words, to obtain the derivative of a power series, one needs only
to differentiate the series term by term. To see this, we first show that the
radius of the power series $\sum (n+1) a_{n+1} x^n$ is $R$. Here the nth coefficient is
$b_n = (n+1) a_{n+1}$, so

$$|b_n|^{1/n} = (n+1)^{1/n} |a_{n+1}|^{1/n} = (n+1)^{1/n} \left[ |a_{n+1}|^{1/(n+1)} \right]^{(n+1)/n},$$

so the upper limit of $(|b_n|^{1/n})$ equals the upper limit of $(|a_n|^{1/n})$ since
$(n+1)^{1/n} \to 1$ (§3.3) and $(n+1)/n \to 1$.

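Now we show that $f'(c)$ exists and equals $\sum n a_n c^{n-1}$, where $-R < c < R$
is fixed. To do this, let us consider only a single term in the series, i.e., let
us consider $x^n$ with $n$ fixed, and pick $c$ real. Then by the binomial theorem
(§3.3),

$$x^n = [c + (x - c)]^n = \sum_{j=0}^{n} \binom{n}{j} c^{n-j} (x - c)^j.$$

Thus, for $d > 0$ and $0 < |x - c| < d$,

$$\begin{aligned}
\left| \frac{x^n - c^n}{x - c} - n c^{n-1} \right| &= \left| \sum_{j=2}^{n} \binom{n}{j} c^{n-j} (x - c)^{j-1} \right| \\
&\le |x - c| \sum_{j=2}^{n} \binom{n}{j} |c|^{n-j} |x - c|^{j-2} \\
&\le |x - c| \sum_{j=2}^{n} \binom{n}{j} |c|^{n-j} d^{j-2} \\
&\le \frac{|x - c|}{d^2} \sum_{j=0}^{n} \binom{n}{j} |c|^{n-j} d^j = \frac{|x - c|}{d^2} (|c| + d)^n,
\end{aligned}$$

where we have used the binomial theorem again. To summarize,

$$\left| \frac{x^n - c^n}{x - c} - n c^{n-1} \right| \le \frac{|x - c|}{d^2} (|c| + d)^n, \qquad 0 < |x - c| < d. \tag{3.4.4}$$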

Now assume $|c| < R$ and choose $d < R - |c|$. Then by the triangle inequality,
$|x - c| < d$ implies $|x| < R$ since $|x| \le |x - c| + |c| < d + |c| < R$. Assume
also temporarily that the coefficients $a_n$ are nonnegative, $a_n \ge 0$, $n \ge 0$.
Multiplying (3.4.4) by $a_n$ and summing over $n \ge 0$ yields

$$\left| \frac{f(x) - f(c)}{x - c} - \sum_{n=1}^{\infty} n a_n c^{n-1} \right| \le \frac{|x - c|}{d^2} f(|c| + d), \qquad 0 < |x - c| < d.$$

Letting $x \to c$ in the last inequality establishes the result when $a_n \ge 0$, $n \ge 0$.
If this is not so, we obtain instead

$$\left| \frac{f(x) - f(c)}{x - c} - \sum_{n=1}^{\infty} n a_n c^{n-1} \right| \le \frac{|x - c|}{d^2} g(|c| + d), \qquad 0 < |x - c| < d,$$

where $g(x) = \sum_{n=0}^{\infty} |a_n| x^n$, so the same argument works.
Since this theorem can be applied repeatedly to $f$, $f'$, $f''$, ..., every power
series with radius of convergence $R$ determines a smooth function $f$ on
$(-R, R)$. For example, $f(x) = \sum a_n x^n$ implies

$$f''(x) = \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}$$

on the interval of convergence. More generally, on $(-R, R)$,

$$f^{(j)}(x) = \sum_{n=j}^{\infty} n(n-1)\cdots(n-j+1)\, a_n x^{n-j}, \qquad j \ge 0.$$

In particular, sin and cos are smooth functions on $\mathbb{R}$ (we already knew
this for exp, cosh, and sinh). Moreover, differentiating the series for sin and
cos, term by term, yields $\sin' = \cos$ and $\cos' = -\sin$.
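
Term-by-term differentiation acts on the coefficient sequence by $a_n \mapsto (n+1) a_{n+1}$, which is easy to check on truncated coefficient lists. The following sketch does this for the sine and cosine series with exact rationals; the function names are hypothetical, and only finitely many coefficients are compared.

```python
from fractions import Fraction
from math import factorial


def differentiate(coeffs):
    """Term-by-term derivative: from [a_0, a_1, a_2, ...] return [a_1, 2*a_2, 3*a_3, ...]."""
    return [n * a for n, a in enumerate(coeffs)][1:]


def sin_coeffs(count):
    """First `count` coefficients of sin x = x - x^3/3! + x^5/5! - ..."""
    return [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 else Fraction(0)
            for n in range(count)]


def cos_coeffs(count):
    """First `count` coefficients of cos x = 1 - x^2/2! + x^4/4! - ..."""
    return [Fraction(0) if n % 2 else Fraction((-1) ** (n // 2), factorial(n))
            for n in range(count)]


if __name__ == "__main__":
    print(differentiate(sin_coeffs(8)) == cos_coeffs(7))                      # sin' = cos
    print(differentiate(cos_coeffs(8)) == [-c for c in sin_coeffs(7)])        # cos' = -sin
```

Differentiating a list of length $N$ yields $N - 1$ coefficients, which is why the comparison lists are one entry shorter.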


Exercises 
3.4.1. Use the exponential series to compute e to four decimal places, justi- 
fying your reasoning. 


3.4.2. Show directly that $\lim_{n \to \infty} (n!)^{1/n} = \infty$. (First, show that the lower
limit is $\ge 100$.)


3.4.3. Suppose that $\sum a_n x^n$ and $\sum b_n x^n$ both converge to $f(x)$ on $(-R, R)$.
Show that $a_n = b_n$ for $n \ge 0$. Thus, the coefficients of a power series are
uniquely determined.


3.4.4. Show that the inverse $\operatorname{arcsinh} : \mathbb{R} \to \mathbb{R}$ of $\sinh : \mathbb{R} \to \mathbb{R}$ exists and
is smooth, and compute $\operatorname{arcsinh}'$. Show that $f(x) = \cosh x$ is superlinear,
smooth, and strictly convex. Compute the Legendre transform $g(y)$
(Exercise 3.3.16), and check that $g$ is smooth.


3.4.5. Compute the radius of convergence of

$$\sum_{n=0}^{\infty} (-1)^n x^n / 4^n (n!)^2.$$


3.4.6. What is the radius of convergence of

$$\sum_{n=0}^{\infty} x^{n!} = x + x + x^2 + x^6 + x^{24} + \dots\,?$$
3.4.7. Let $f(x) = a_0 + a_1 x + a_2 x^2 + \dots$ converge for $-R < x < R$. Then $f$ is
even iff the coefficients $a_{2n-1}$, $n \ge 1$, of the odd powers vanish, and $f$ is odd
iff the coefficients $a_{2n}$, $n \ge 0$, of the even powers vanish.


3.4.8. Let $k \ge 0$. Show that there are integers $a_j$, $0 \le j \le k$, such that


Use this to show the sum 
is a natural. 


3.5 Taylor Series 


A function $f$ is analytic at 0 with radius of convergence $R$ if it can be
expressed as a convergent power series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad |x| < R, \tag{3.5.1}$$

on $(-R, R)$. A function is entire at 0 if it is analytic at 0 with infinite radius
of convergence. For example, exp, sinh, cosh, sin, and cos are analytic at 0
and are in fact entire.

If $f$ and $g$ are analytic at 0 with radius of convergence $R$, then so is $f + g$.
Since the Cauchy product (§1.7) of absolutely convergent series is absolutely
convergent, if $f$ and $g$ are analytic at 0 with radius of convergence $R$, then
so is $fg$. In general, however, it is not true that the quotient $f/g$ is analytic
at 0 with the same radius of convergence $R$. For example, the quotient of the
entire functions $f(x) = 1$ and $g(x) = 1 - x$ is not entire.

Let $f$ be analytic at 0 with radius of convergence $R$. In §3.4, we saw that
this implies the smoothness of $f$ on $(-R, R)$; the derivatives of $f$ are also
analytic at 0, with the same radius of convergence, and all of the form

$$f^{(j)}(x) = \sum_{n=j}^{\infty} n(n-1)\cdots(n-j+1)\, a_n x^{n-j}, \qquad j \ge 0. \tag{3.5.2}$$

Inserting $x = 0$ in these formulas, we obtain $j!\, a_j = f^{(j)}(0)$ for $j \ge 0$; hence,

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = f(0) + f'(0) x + \frac{f''(0)}{2!} x^2 + \dots, \tag{3.5.3}$$

on $(-R, R)$. The power series in (3.5.3) is the Taylor series of $f$ centered at
zero.⁴

Let $c$ be a real. A function $f$ is analytic at $c$ with radius of convergence $R$
if $f(x + c)$ is analytic at 0 with radius of convergence $R$. Equivalently, $f$ is
analytic at $c$ with radius of convergence $R$ if

$$f(x) = \sum_{n=0}^{\infty} a_n (x - c)^n, \qquad |x - c| < R.$$

A function is entire at $c$ if it is analytic at $c$ with infinite radius of convergence.

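Now suppose $f$ is analytic at 0 with radius $R$ and fix $0 \le c < x < R$.
Suppose also that the coefficients $a_n$ in (3.5.1) are nonnegative for all $n \ge 0$.
Using (3.5.2), the binomial theorem, (1.7.10), and the substitution $n = k + j$,

$$\begin{aligned}
f(x) = \sum_{n=0}^{\infty} a_n x^n &= \sum_{n=0}^{\infty} a_n \left( \sum_{j=0}^{n} \binom{n}{j} c^{n-j} (x - c)^j \right) \\
&= \sum_{(j,k)} \binom{k+j}{j} a_{k+j}\, c^k (x - c)^j \\
&= \sum_{j=0}^{\infty} \left( \sum_{n=j}^{\infty} \binom{n}{j} a_n c^{n-j} \right) (x - c)^j \\
&= \sum_{j=0}^{\infty} \left[ \sum_{n=j}^{\infty} n(n-1)\cdots(n-j+1)\, a_n c^{n-j} \right] \frac{(x - c)^j}{j!} \\
&= \sum_{j=0}^{\infty} \frac{f^{(j)}(c)}{j!} (x - c)^j.
\end{aligned} \tag{3.5.4}$$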

This last series is the Taylor series of f centered at c. We arrive at the 
following. 


Theorem 3.5.1 (Taylor series). Let $f$ be analytic at 0 with radius of
convergence $R$. Then, for all $|c| < R$, $f$ is analytic at $c$ with radius of convergence
$R - |c|$, and

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!} (x - c)^2 + \dots \tag{3.5.5}$$

for $|x - c| < R - |c|$.


⁴ Also called the Maclaurin series of $f$.

This result follows as soon as we establish the validity of the result (3.5.4).
We assumed $a_n \ge 0$, $n \ge 0$, and $0 \le c < x < R$ so that all terms in
the computation leading to (3.5.4) are nonnegative, allowing us to apply the
result (1.7.10) that nonnegative double series equal either iterated series.

More generally, when $|c| < R$ and $|x - c| < R - |c|$ and $(a_n)$ is signed,
to establish the result (3.5.4), we apply the result (1.7.12) that summable
double series equal either iterated series. To this end, we must first establish
the summability of the double series

$$\sum_{(j,k)} \binom{k+j}{j} a_{k+j}\, c^k (x - c)^j. \tag{3.5.6}$$

Let $g(x) = \sum |a_n| x^n$. If $|c| < R$ and $|x - c| < R - |c|$, then
$0 \le |c| \le |c| + |x - c| < R$ and $(|a_n|)$ is nonnegative, so the computation leading to
(3.5.4) is valid if we replace $c$ by $|c|$, $x$ by $|c| + |x - c|$, $a_n$ by $|a_n|$, and $f$ by
$g$. Thus, we obtain

$$g(|c| + |x - c|) = \sum_{(j,k)} \binom{k+j}{j} |a_{k+j}|\, |c|^k |x - c|^j. \tag{3.5.7}$$

Since $g(|c| + |x - c|)$ is finite, this establishes the summability of (3.5.6);
hence, we may go back and apply (1.7.12), establishing the validity of the
result (3.5.4) when $|c| < R$ and $|x - c| < R - |c|$.

In particular, f entire at 0 implies f entire at c for all real c. Hence, for 
entire functions, we drop the qualifiers “at 0” and “at c” and simply call such 
functions entire. 
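
The re-centering of a power series can be spot-checked on the geometric series $1/(1 - x) = \sum x^n$, for which $R = 1$ and $f^{(j)}(c)/j! = 1/(1 - c)^{j+1}$. The sketch below truncates the inner sum $\sum_{n \ge j} \binom{n}{j} a_n c^{n-j}$ after a fixed number of terms; the function names and the truncation length are illustrative assumptions.

```python
from math import comb


def recentered_coeff(a, c, j, terms=400):
    """Approximate sum_{n >= j} C(n, j) * a(n) * c**(n - j), truncated after `terms` terms."""
    return sum(comb(n, j) * a(n) * c ** (n - j) for n in range(j, j + terms))


def geometric_coeff(n):
    """Coefficients a_n = 1 of the geometric series 1/(1 - x), which has R = 1."""
    return 1.0


if __name__ == "__main__":
    c = 0.5
    for j in range(5):
        # For f(x) = 1/(1 - x), f^(j)(c)/j! = 1/(1 - c)**(j + 1).
        print(j, recentered_coeff(geometric_coeff, c, j), 1.0 / (1.0 - c) ** (j + 1))
```

The truncation is harmless here because the terms decay geometrically for $|c| < 1$; closer to the boundary more terms would be needed.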

We now turn to the converse situation. Starting with a smooth function
$f$ on $(-R, R)$, what can we say about its Taylor series (3.5.5)? It turns out
the Taylor series of a smooth function may or may not converge for a given
$x$. When it does converge, its sum need not equal the function. For example
(Exercise 3.5.2), the function

$$f(x) = \begin{cases} e^{-1/x}, & x > 0, \\ 0, & x \le 0, \end{cases}$$

is smooth on $\mathbb{R}$ and satisfies $f^{(n)}(0) = 0$ for all $n \ge 0$. Thus, for all $x$, the
Taylor series centered at zero converges, since it is identically zero. Hence,
the Taylor series does not sum to $f(x)$, except when $x \le 0$.

The above example shows that no growth condition on the sequence
$(a_n) = (f^{(n)}(0)/n!)$ can guarantee the analyticity at zero of $f$. Nevertheless,
a necessary condition for the convergence on $(-R, R)$ of the Taylor series
centered at $c$ is the boundedness of its terms: $|f^{(n)}(c)|\, d^n/n! \le C$, $n \ge 1$,
for $0 < d < R - |c|$. This is strengthened to a sufficient condition in
Exercise 3.5.8.

To study the convergence of the Taylor series to the function, it makes
sense to start with the n-th partial sum. Let $f$ be continuous on $(a,b)$ and
assume $f$ is $n$ times differentiable on $(a,b)$. For $n \ge 0$, the Taylor polynomial
of degree $n$ at $c$ is

$$p_n(x, c) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \dots + \frac{f^{(n)}(c)}{n!}(x - c)^n.$$

Define the $(n+1)$st remainder

$$R_{n+1}(x, c) = f(x) - p_n(x, c), \qquad a < x < b;$$

for example, $R_1(x, c) = f(x) - f(c)$. If we write $R_{n+1}(x, c; f)$ to emphasize
dependence on $f$, note that $R'_{n+1}(x, c; f) = R_n(x, c; f')$. Now define
$h_{n+1} : (a,b) \to \mathbb{R}$ by

$$h_{n+1}(x) = \begin{cases} \dfrac{R_{n+1}(x, c)}{(x - c)^{n+1}/(n+1)!}, & x \ne c, \\[2ex] f^{(n+1)}(c), & x = c. \end{cases}$$

Now assume $f^{(n)}$ is differentiable at $c$, i.e., assume $f^{(n+1)}(c)$ exists. Then
$h_{n+1}$ is continuous when $x \ne c$. Applying l'Hôpital's rule $n$ times,

$$\begin{aligned}
\lim_{x \to c} h_{n+1}(x) &= \lim_{x \to c} \frac{R_{n+1}(x, c)}{(x - c)^{n+1}/(n+1)!} = \lim_{x \to c} \frac{R'_{n+1}(x, c)}{(x - c)^n/n!} \\
&= \lim_{x \to c} \frac{R_n(x, c; f')}{(x - c)^n/n!} = \lim_{x \to c} \frac{R_{n-1}(x, c; f'')}{(x - c)^{n-1}/(n-1)!} \\
&= \dots = \lim_{x \to c} \frac{R_1(x, c; f^{(n)})}{x - c} = \lim_{x \to c} \frac{f^{(n)}(x) - f^{(n)}(c)}{x - c} = f^{(n+1)}(c).
\end{aligned}$$

Thus, $h_{n+1}$ is continuous on $(a, b)$.


Theorem 3.5.2 (Taylor's Theorem). Suppose $f$ is continuous on $(a,b)$
and $n \ge 0$. Suppose also $f$ is $n$ times differentiable on $(a,b)$, and let $p_n(x, c)$
denote the Taylor polynomial at $c$ of degree $n$. If $f^{(n+1)}(c)$ exists at $c$ in $(a,b)$,
then there is a continuous function⁵ $h_{n+1} : (a,b) \to \mathbb{R}$ satisfying
$h_{n+1}(c) = f^{(n+1)}(c)$ and

$$f(x) = p_n(x, c) + \frac{h_{n+1}(x)}{(n+1)!}(x - c)^{n+1}. \tag{3.5.8}$$

Moreover, if $f^{(n+1)}$ exists on all of $(a,b)$, then for some $\xi$ between $c$ and $x$,
$R_{n+1}(x, c)$ is given by the Cauchy form

$$R_{n+1}(x, c) = \frac{h_{n+1}(x)}{(n+1)!}(x - c)^{n+1} = \frac{f^{(n+1)}(\xi)}{n!}(x - \xi)^n (x - c),$$

and, for some $\eta$ between $c$ and $x$, $R_{n+1}(x, c)$ is given by the Lagrange form

$$\frac{h_{n+1}(x)}{(n+1)!}(x - c)^{n+1} = \frac{f^{(n+1)}(\eta)}{(n+1)!}(x - c)^{n+1}.$$


⁵ Taylor's theorem in §4.4 gives a useful formula for $h_{n+1}$.

The derivation of (3.5.8) is above. To obtain the Cauchy and Lagrange
forms, differentiate the expression defining $R_{n+1}(x, c)$ once with respect to $c$.
Then the sum collapses to

$$\frac{d}{dc} R_{n+1}(x, c) = -\frac{f^{(n+1)}(c)}{n!}(x - c)^n.$$

Now apply the mean value theorem to $t \mapsto R_{n+1}(x, t)$ on the interval joining
$c$ to $x$. Since $R_{n+1}(x, x) = 0$, this yields the Cauchy form. To obtain
the Lagrange form, set $g(t) = (x - t)^{n+1}/(n+1)!$ and fix $x$ and $c$. Then
$g'(t) = -(x - t)^n/n!$, $g(x) = 0$, and (remember $'$ here is with respect to
$t$) $R'_{n+1}(x, t)/g'(t) = f^{(n+1)}(t)$. So by the generalized mean value theorem,
there is an $\eta$ between $c$ and $x$ with

$$h_{n+1}(x) = \frac{R_{n+1}(x, c)}{g(c)} = \frac{R_{n+1}(x, c) - R_{n+1}(x, x)}{g(c) - g(x)} = \frac{R'_{n+1}(x, \eta)}{g'(\eta)} = f^{(n+1)}(\eta).$$


In particular, the Lagrange form implies that, for $f$ smooth on $\mathbb{R}$ and any
$n \ge 1$, for each $x \in \mathbb{R}$, there is an $\eta$ between 0 and $x$ with

$$f(x) = f(0) + f'(0) x + \frac{f''(0)}{2!} x^2 + \dots + \frac{f^{(n)}(0)}{n!} x^n + \frac{f^{(n+1)}(\eta)}{(n+1)!} x^{n+1}. \tag{3.5.9}$$

Thus, a smooth function $f$ can be approximated near 0 by an nth-degree
polynomial with an error $R_{n+1}(x, 0)$ given by a certain expression, for every
$n \ge 1$.
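
For a concrete case, take $f = \exp$: every derivative is again $e^x$, so the Lagrange form bounds the error of the degree-$n$ Taylor polynomial at 0 by $e^{\max(x, 0)} |x|^{n+1}/(n+1)!$. The sketch below compares the actual error with this bound; the function names are hypothetical, and `math.exp` is used only as the reference value.

```python
import math


def taylor_exp(x: float, n: int) -> float:
    """Degree-n Taylor polynomial of exp centered at 0, evaluated at x."""
    total, term = 0.0, 1.0  # term holds x**k / k!
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)
    return total


def lagrange_bound_exp(x: float, n: int) -> float:
    """Bound on |R_{n+1}(x, 0)| for exp, using e**eta <= e**max(x, 0) for eta between 0 and x."""
    return math.exp(max(x, 0.0)) * abs(x) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    for x in (1.0, -2.0, 0.5):
        for n in (1, 3, 8):
            error = abs(math.exp(x) - taylor_exp(x, n))
            print(x, n, error, lagrange_bound_exp(x, n))
```

In each printed row the actual error should not exceed the bound, and both shrink rapidly as $n$ increases.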

Using (3.5.8) with $f(x) = e^x$, we can show that $e$ is irrational. Indeed,
suppose that $e$ were rational. Then there would be a natural $N$, such that
$n!\, e \in \mathbb{N}$ for all $n \ge N$. So choose $n$ greater than 3 and greater than $N$, and
write (3.5.8), with the Lagrange form of the remainder, for this $n$ and $x = 1$ to get

$$e = 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{n!} + \frac{e^{\eta}}{(n+1)!}.$$

Then

$$n!\, e = n! \left( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{n!} \right) + \frac{e^{\eta}}{n+1}.$$

So $e^{\eta}/(n + 1)$ is a natural, which is false since $0 < \eta < 1$ and $n > 3$ imply

$$0 < \frac{e^{\eta}}{n+1} < \frac{e}{n+1} < \frac{3}{4} < 1.$$

This contradiction allows us to conclude the following. 


Theorem 3.5.3. e is irrational. 


Our last topic is to describe Newton's generalization of the binomial theorem
(§3.3) to nonnatural exponents. The result is the same, except that the
sum proceeds to $\infty$. For $v$ real and $n \ge 1$, let

$$\binom{v}{n} = \frac{v(v-1)\cdots(v-n+1)}{1 \cdot 2 \cdots n}.$$

Also let $\binom{v}{0} = 1$. If $v$ is a natural, then $\binom{v}{n}$ is the binomial coefficient defined
previously for $0 \le n \le v$ and $\binom{v}{n} = 0$ for $n > v$. These binomial coefficients
grow at most polynomially as $n \to \infty$.


Theorem 3.5.4. For $v$ real,

$$\left| \binom{v}{n} \right| \le (n + |v|)^{|v|}, \qquad n \ge 1.$$


To see this, let $N$ be the integer part of $v + 1$, so $N - 1 \le v < N$, and
consider the case $N \ge 1$. Since for $n < N$ we have
$0 < v(v-1)\cdots(v-n+1) \le N! \le N^{N-1} \le (v+1)^v \le (n+v)^v$, we may assume $n \ge N$.
Then $v(v-1)\cdots(v-n+1)$ is the product of the positive factors
$v(v-1)\cdots(v-N+1)$ and the negative factors $(v-N)\cdots(v-n+1)$. Hence, the
absolute value of $v(v-1)\cdots(v-n+1)$ is no larger than $N!\,(n-N)!$; hence,
$|\binom{v}{n}| \le N!\,(n-N)!/n! \le N! \le (n+v)^v$. Now assume $N \le 0$. Then all the
factors in $v(v-1)\cdots(v-n+1)$ are negative; hence, its absolute value is no
larger than $(n-N)!$. Hence,

$$\left| \binom{v}{n} \right| \le \frac{(n-N)!}{n!} \le (n-N)^{-N} \le (n + |v|)^{|v|}.$$

The above estimate is not sharp; for example, one can see directly
$|\binom{-1/2}{n}| \le 1$. In fact, $|\binom{-1/2}{n}| \sim 1/\sqrt{\pi n}$ as $n \to \infty$ (Exercise 5.5.3).
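
The generalized binomial coefficient is a finite product and is straightforward to evaluate. The sketch below, with the hypothetical helper `gen_binom`, compares $|\binom{v}{n}|$ against a polynomial bound of the form $(n + |v|)^{|v|}$ and against the asymptotic $1/\sqrt{\pi n}$ for $v = -1/2$; floating-point values are only approximate.

```python
import math


def gen_binom(v: float, n: int) -> float:
    """Generalized binomial coefficient v(v-1)...(v-n+1)/n!, equal to 1 when n = 0."""
    result = 1.0
    for i in range(n):
        result *= (v - i) / (i + 1)
    return result


if __name__ == "__main__":
    v = -0.5
    for n in (1, 10, 100, 1000):
        value = abs(gen_binom(v, n))
        # Columns: n, |C(v, n)|, polynomial bound, |C(v, n)| * sqrt(pi * n).
        print(n, value, (n + abs(v)) ** abs(v), value * math.sqrt(math.pi * n))
```

The last column should drift toward 1, while the third column stays well above the second, illustrating that the polynomial bound is far from sharp.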


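Theorem 3.5.5 (Newton's Binomial Theorem). For $v$ real and $|b| < |a|$,

$$(a + b)^v = \sum_{n=0}^{\infty} \binom{v}{n} a^{v-n} b^n.$$

In particular,

$$(1 + x)^v = \sum_{n=0}^{\infty} \binom{v}{n} x^n, \qquad -1 < x < 1.$$

By setting $x = b/a$ and multiplying by $a^v$ in the second form, the first
form follows from the second. Let $f(x) = (1 + x)^v$. Since

$$f^{(n)}(x) = v(v-1)\cdots(v-n+1)(1 + x)^{v-n},$$

$f^{(n)}(x)/n! = \binom{v}{n}(1 + x)^{v-n}$. Hence, using Lagrange's form of the remainder,
for some $0 < \eta < x < 1$ (see §3.3 for the limits), we obtain, once $n + 1 \ge v$,

$$|R_{n+1}(x, 0)| = \left| \binom{v}{n+1} \right| (1 + \eta)^{v-n-1} x^{n+1} \le (n + 1 + |v|)^{|v|} x^{n+1},$$

which goes to zero as $n \nearrow \infty$.

To establish the result on $(-1, 0)$, apply the Cauchy form of the remainder
to $f(x) = (1 - x)^v$ and $0 < x < 1$, $c = 0$. Then
$f^{(n+1)}(x)/n! = (-1)^{n+1} \binom{v}{n+1} (1 - x)^{v-n-1} (n + 1)$, so for some $0 < \xi < x$, we obtain

$$\begin{aligned}
|R_{n+1}(x, 0)| &= (n+1) \left| \binom{v}{n+1} (-1)^{n+1} (1 - \xi)^{v-n-1} (x - \xi)^n (x - 0) \right| \\
&= (n+1)\, x \left| \binom{v}{n+1} \right| \left( \frac{x - \xi}{1 - \xi} \right)^n (1 - \xi)^{v-1} \\
&\le (n+1)\, x\, (n + 1 + |v|)^{|v|} \left( \frac{x - \xi}{1 - \xi} \right)^n (1 - \xi)^{v-1}.
\end{aligned}$$

If $v \ge 1$, $(1 - \xi)^{v-1} \le 1$. If $v < 1$, $(1 - \xi)^{v-1} \le (1 - x)^{v-1}$. Hence, in both
cases, $(1 - \xi)^{v-1}$ is bounded by $[1 + (1 - x)^{v-1}]$, a fixed quantity independent
of $n$ (remember that $\xi$ may depend on $n$). Moreover,
$(x - \xi)/(1 - \xi) \le (x - 0)/(1 - 0) = x$. Hence,

$$|R_{n+1}(x, 0)| \le [1 + (1 - x)^{v-1}]\, (n+1)(n + 1 + |v|)^{|v|}\, x^{n+1},$$

which goes to zero as $n \nearrow \infty$.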
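
The convergence asserted by Newton's binomial theorem can be observed numerically by summing $\sum \binom{v}{n} x^n$ with a running term. This is a minimal sketch: the function name `binomial_series` and the default number of terms are illustrative, and the comparison uses Python's built-in power operator as the reference.

```python
def binomial_series(v: float, x: float, terms: int = 200) -> float:
    """Partial sum of sum_{n>=0} C(v, n) * x**n, intended for |x| < 1."""
    total, term = 0.0, 1.0  # term holds C(v, n) * x**n
    for n in range(terms):
        total += term
        term *= (v - n) / (n + 1) * x
    return total


if __name__ == "__main__":
    for v, x in [(0.5, 0.2), (-1.5, -0.5), (3.0, 0.9), (-0.5, 0.99)]:
        print(v, x, binomial_series(v, x), (1.0 + x) ** v)
```

Near $|x| = 1$ the terms decay slowly, so the default truncation may leave a visible discrepancy there; for natural $v$ the series terminates and the sum is a finite polynomial.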


Exercises 


3.5.1. Suppose that f : R — R is nonnegative, twice differentiable, and 
f"(c) < 1/2 for all c € R. Use Taylor’s theorem to conclude that | f’(c)| < 
3.5.2. Let

$$h(x) = \begin{cases} e^{-1/x}, & x > 0, \\ 0, & x \le 0. \end{cases}$$

By induction on $n \ge 1$, show that

A. $h^{(n)}(x) = h(x) R_n(x)$, $x > 0$, for some rational function $R_n$, and
B. $h^{(n)}(x) = 0$, $x \le 0$.

Conclude that $h$ (Figure 3.11) is smooth on $\mathbb{R}$.


[Figure: graph of $h$, with the origin marked]

Fig. 3.11 Graph of the function h in Exercise 3.5.2


3.5.3. Show that 
1 
vVi-@ 
for |a| <1. 
3.5.4. Compute the Taylor series of log(1 + x) centered at 0. 
3.5.5. Show that 


log(1 + a) 1\ 5 1 1\ 35 1 1 1\4 
———_ =2z-(l+- l+st+s l+ts+ster soe 
Tas x = 5 e+i{ilt+ 2° 3 x 273 si 7 uo + 
for |x| < 1 by considering the product (§1.7) of the series for 1/(1+ x) and 

the series for log(1 + x). 


3.5.6. Let f : R > R be twice differentiable with f, f’, and f” continuous. If 
f(0) =1, f’(0) = 0, and f”(0) = q, use Taylor’s theorem and Exercise 3.2.3 
to show that 


lim [f(«/Vn)]" = exp(qx?/2). 


noo 
This result is a key step in the derivation of the central limit theorem. 


3.5.7. Compute 


by writing $e^t = 1 + t + t^2 h(t)/2$.
3.5.8. Let $f : (-R, R) \to \mathbb{R}$ be smooth and let $|c| < R$, $d < R - |c|$. Suppose
there is real $C > 0$ with

$$|f^{(n)}(x)|\, \frac{d^n}{n!} \le C, \qquad n \ge 1,\ |x - c| < d.$$

Show that the Taylor series of $f$ centered at $c$ converges to $f(x)$ for $|x - c| < d$.


3.5.9. Suppose $f(x) = \sum a_n x^n$ is a power series on $(-R, R)$ and let
$g(x) = \sum |a_n| x^n$. Show that for $|c| < R$ and $d < R - |c|$,

$$|f(x) - p_n(x, c)| \le \frac{|x - c|^{n+1}}{d^{n+1}}\, g(|c| + d), \qquad n \ge 0,\ |x - c| < d.$$


3.5.10. Show that

$$\binom{-1/2}{n} = \frac{(-1)^n (2n)!}{4^n (n!)^2} = \frac{(-1)^n}{4^n} \binom{2n}{n}, \qquad n \ge 0.$$
The trigonometric functions or circular functions sine and cosine may be
defined either by power series or by measuring the length of arcs along the
unit circle. As the second method involves integration, it cannot be discussed
until Chapter 4 (Exercise 4.4.29). Because of this, we base our development
here on power series, which is what we have at hand.

In the previous section, we introduced alternating versions of the even and
the odd parts of the exponential series, the sine function, and the cosine
function,

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots,$$

and

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots.$$

Since these functions are defined by these convergent power series, they are
smooth everywhere and satisfy (§3.4):

$$(\sin x)' = \cos x, \qquad (\cos x)' = -\sin x,$$

$\sin 0 = 0$, and $\cos 0 = 1$. The sine function is odd and the cosine function is
even,

$$\sin(-x) = -\sin x,$$

and

$$\cos(-x) = \cos x.$$
Since

$$(\sin^2 x + \cos^2 x)' = 2 \sin x \cos x + 2 \cos x (-\sin x) = 0,$$

$\sin^2 + \cos^2$ is a constant; evaluating $\sin^2 x + \cos^2 x$ at $x = 0$ yields 1; hence,

$$\sin^2 x + \cos^2 x = 1 \tag{3.6.1}$$

for all $x$. This implies $|\sin x| \le 1$ and $|\cos x| \le 1$ for all $x$. If $a$ is a critical
point of $\sin x$, then $\cos a = 0$; hence, $\sin a = \pm 1$. Hence, $\sin a = 1$ at any
positive local maximum $a$ of $\sin x$.

Let $f$, $g$ be differentiable functions satisfying $f' = g$ and $g' = -f$ on $\mathbb{R}$.
Now the derivatives of $f \sin + g \cos$ and $f \cos - g \sin$ vanish; hence,

$$f(x) \sin x + g(x) \cos x = g(0),$$

and

$$f(x) \cos x - g(x) \sin x = f(0)$$

for all $x$. Multiplying the first equation by $\sin x$ and the second by $\cos x$ and
adding, we obtain

$$f(x) = g(0) \sin x + f(0) \cos x.$$

Multiplying the first by $\cos x$ and the second by $-\sin x$ and adding, we obtain

$$g(x) = g(0) \cos x - f(0) \sin x.$$

Fixing $y$ and taking $f(x) = \sin(x + y)$ and $g(x) = \cos(x + y)$, we obtain the
identities

$$\begin{aligned}
\sin(x + y) &= \sin x \cos y + \cos x \sin y, \\
\cos(x + y) &= \cos x \cos y - \sin x \sin y.
\end{aligned} \tag{3.6.2}$$

If we replace $y$ by $-y$ and combine the resulting equations with (3.6.2), we
obtain the identities

$$\begin{aligned}
\sin(x + y) + \sin(x - y) &= 2 \sin x \cos y, \\
\sin(x + y) - \sin(x - y) &= 2 \cos x \sin y, \\
\cos(x - y) - \cos(x + y) &= 2 \sin x \sin y, \\
\cos(x - y) + \cos(x + y) &= 2 \cos x \cos y.
\end{aligned} \tag{3.6.3}$$
For $0 < x \le 3$, the series
$$x - \sin x = \frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \dots$$
is alternating with decreasing terms (Exercise 3.6.8). Hence, by the Leibnitz
test (§1.7),
$$x - \sin x \le \frac{x^3}{6}, \qquad 0 < x \le 3,$$
and
$$x - \sin x \ge \frac{x^3}{6} - \frac{x^5}{120}, \qquad 0 < x \le 3.$$

Inserting $x = 1$ in the first inequality and $x = 3$ in the second, we obtain
$$\sin 0 = 0, \qquad \sin 1 \ge \frac{5}{6}, \qquad \sin 3 \le \frac{21}{40}.$$


Hence, there is a positive $b$ in $(0,3)$, where $\sin b$ is a positive maximum, which
gives $\cos b = 0$ and $\sin b = 1$. Let $a = \inf\{b > 0 : \sin b = 1\}$. Then the
continuity of $\sin x$ implies $\sin a = 1$. Since $\sin 0 = 0$, $a > 0$. Since $\sin$ is a
specific function, $a$ is a specific real number.
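
The two Leibnitz-test bounds used above can be checked with exact rational arithmetic. The following Python sketch sums the defining series for the sine; the helper name `sin_partial` and the choice of 12 terms as a stand-in for the full series are our own assumptions, not part of the text.

```python
from fractions import Fraction
from math import factorial

def sin_partial(x, terms):
    """Partial sum x - x^3/3! + x^5/5! - ... with `terms` terms (exact for rational x)."""
    x = Fraction(x)
    return sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1)
               for k in range(terms))

# The two- and three-term partial sums are the bounds quoted in the text.
lower_at_1 = sin_partial(1, 2)   # 1 - 1/6
upper_at_3 = sin_partial(3, 3)   # 3 - 27/6 + 243/120

# A longer partial sum approximates sin itself (an approximation, not the exact value).
approx_sin_1 = sin_partial(1, 12)
approx_sin_3 = sin_partial(3, 12)

print(lower_at_1, float(approx_sin_1))
print(upper_at_3, float(approx_sin_3))
```

Since the terms of the series decrease for $0 < x \le 3$, consecutive partial sums bracket $\sin x$, which is why the short sums above serve as bounds.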

For more than 20 centuries, the real $2a$ has been called "Archimedes'
constant." More recently, for the last 200 years, the Greek letter $\pi$ has been
used to denote this real. Thus,
$$\cos\frac{\pi}{2} = 0,$$
and
$$\sin\frac{\pi}{2} = 1.$$

As yet, all we know about $\pi$ is $0 < \pi/2 < 3$. In §5.2, we address the issue of
computing $\pi$ accurately.

Since the slope of the tangent line of $\sin x$ at $x = 0$ is $\cos 0 = 1$, $\sin x > 0$
for $x > 0$ near 0. Since there are no positive local maxima for $\sin x$ in
$(0,\pi/2)$ and $\sin 0 = 0$, we must have $\sin x > 0$ on $(0,\pi/2)$. Hence, $\cos x$ is
strictly decreasing on $(0,\pi/2)$. Hence, $\cos x$ is positive on $(0,\pi/2)$. Hence,
$\sin x$ is strictly increasing on $(0,\pi/2)$. Moreover, since $(\sin x)'' = -\sin x$ and
$(\cos x)'' = -\cos x$, $\sin x$ and $\cos x$ are concave on $(0,\pi/2)$. This justifies the
graphs of $\sin x$ and $\cos x$ in the interval $[0,\pi/2]$ (Figure 3.12).

Inserting $y = \pi$ and replacing $x$ by $-x$ in (3.6.2) yields
$$\sin(\pi - x) = \sin x,$$
$$\cos(\pi - x) = -\cos x. \tag{3.6.4}$$
These identities justify the graphs of $\sin x$ and $\cos x$ in the interval $[\pi/2,\pi]$.
Hence, the graphs are now justified on $[0,\pi]$. Replacing $x$ by $-x$ in (3.6.4),
we obtain
$$\sin(x + \pi) = -\sin x,$$
and
$$\cos(x + \pi) = -\cos x.$$
These identities justify the graphs of $\sin x$ and $\cos x$ on $[0, 2\pi]$. Repeating this
reasoning once more,
$$\sin(x + 2\pi) = \sin(x + \pi + \pi) = -\sin(x + \pi) = \sin x,$$
and
$$\cos(x + 2\pi) = \cos(x + \pi + \pi) = -\cos(x + \pi) = \cos x,$$
showing that $2\pi$ is a period of $\sin x$ and $\cos x$. In fact, repeating this reasoning,
$$\sin(x + 2\pi n) = \sin x, \qquad n \in \mathbf{Z},$$
and
$$\cos(x + 2\pi n) = \cos x, \qquad n \in \mathbf{Z},$$
showing that every integer multiple of $2\pi$ is a period of $\sin x$ and $\cos x$. If $\sin x$
or $\cos x$ had any other period $p$, then by subtracting from $p$ an appropriate
integral multiple of $2\pi$, we would obtain a period in $(0, 2\pi)$, contradicting the
graphs. Hence, $2\pi\mathbf{Z}$ is the set of periods of $\sin x$ and of $\cos x$.

If we set $x = y$ in (3.6.2), we obtain
$$\sin(2x) = 2\sin x\cos x,$$
and
$$\cos(2x) = \cos^2 x - \sin^2 x.$$

By (3.6.1), the second identity implies
$$\cos^2 x = \frac{1 + \cos(2x)}{2},$$
and
$$\sin^2 x = \frac{1 - \cos(2x)}{2}.$$


Fig. 3.12 The graphs of sine and cosine 


By the inverse function theorem, $\sin x$ has an inverse on $[-\pi/2, \pi/2]$, and
$\cos x$ has an inverse on $[0, \pi]$. These inverses are $\arcsin : [-1, 1] \to [-\pi/2, \pi/2]$
and $\arccos : [-1, 1] \to [0, \pi]$. Since $\cos x > 0$ on $(-\pi/2, \pi/2)$, by (3.6.1),
$\cos(\arcsin x) = \sqrt{1 - x^2}$ on $[-1, 1]$. Similarly, since $\sin x > 0$ on $(0, \pi)$,
$\sin(\arccos x) = \sqrt{1 - x^2}$ on $[-1, 1]$. Thus, the derivatives of the inverse
functions are given by
$$(\arcsin x)' = \frac{1}{\cos(\arcsin x)} = \frac{1}{\sqrt{1 - x^2}}, \qquad -1 < x < 1,$$
and
$$(\arccos x)' = \frac{1}{-\sin(\arccos x)} = -\frac{1}{\sqrt{1 - x^2}}, \qquad -1 < x < 1.$$

As an application,
$$(2\arcsin\sqrt{x})' = 2\cdot\frac{1}{\sqrt{1 - x}}\cdot\frac{1}{2\sqrt{x}} = \frac{1}{\sqrt{x(1 - x)}}, \qquad 0 < x < 1.$$

Now we make the connection with the unit circle. The unit circle is the
subset $\{(x, y) : x^2 + y^2 = 1\}$ of $\mathbf{R}^2$. The interior $\{(x, y) : x^2 + y^2 < 1\}$ of the
unit circle is the (open) unit disk.


Theorem 3.6.1. If $(x, y)$ is a point on the unit circle, there is a real $\theta$ with
$(x, y) = (\cos\theta, \sin\theta)$. If $\phi$ is any other such real, then $\theta - \phi$ is in $2\pi\mathbf{Z}$.


Since $x^2 + y^2 = 1$, $|x| \le 1$. Let $\theta = \arccos x$. Then $\sin^2\theta + \cos^2\theta = 1$
implies $y = \pm\sin\theta$. If $y = \sin\theta$, we have found a real $\theta$, as required.
Otherwise, replace $\theta$ by $-\theta$. This does not change $\cos\theta = \cos(-\theta)$, but changes
$\sin\theta = -\sin(-\theta)$. For the second statement, suppose that $(\cos\theta, \sin\theta) =
(\cos\phi, \sin\phi)$. Then
$$\cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi = 1.$$

Hence, $\theta - \phi$ is an integer multiple of $2\pi$.
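
The construction in this proof (take the arccosine of the first coordinate, then flip the sign if the second coordinate is negative) can be sketched in Python. The function name `angle_of` is hypothetical, and the code works in floating point, so it assumes the input lies on the unit circle only up to rounding error.

```python
import math

def angle_of(x, y):
    """Return theta in [0, 2*pi) with (cos theta, sin theta) close to (x, y).

    Assumes (x, y) lies on the unit circle up to rounding error.
    """
    x = max(-1.0, min(1.0, x))     # guard against rounding pushing |x| past 1
    theta = math.acos(x)           # theta in [0, pi], so sin(theta) >= 0
    if y < 0:                      # here y = -sin(theta): replace theta by -theta
        theta = -theta
    return theta % (2 * math.pi)   # subtract an integer multiple of 2*pi

theta = angle_of(math.cos(4.0), math.sin(4.0))
print(theta)
```

The final reduction modulo $2\pi$ is harmless by the second statement of the theorem: it changes the real $\theta$ but not the point $(\cos\theta, \sin\theta)$.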

By the Theorem, we may subtract an integer multiple of $2\pi$ from $\theta$ without
affecting $(x, y)$ and thus assume $0 \le \theta < 2\pi$. In this case, it turns out that
$\theta$ equals the length of the counterclockwise circular arc (Figure 3.13) joining
$(1, 0)$ to $(x, y)$ (Exercise 4.4.28). By definition, the real $\theta$ is the angle
corresponding to $(x, y)$. More generally, the angle between $(r\cos\theta, r\sin\theta)$ and
$(t\cos\phi, t\sin\phi)$, with $0 \le \theta \le \phi < 2\pi$, $r > 0$, $t > 0$, is defined to equal $\phi - \theta$.

Given a real $\theta$, a rotation by $\theta$ is the map $R : \mathbf{R}^2 \to \mathbf{R}^2$ given by $R(x, y) =
(x\cos\theta - y\sin\theta, x\sin\theta + y\cos\theta)$. Given $(a, b)$, a translation by $(a, b)$ is the map
$T : \mathbf{R}^2 \to \mathbf{R}^2$ given by $T(x, y) = (x + a, y + b)$. Given two points $A = (a, b)$
and $B = (c, d)$ in $\mathbf{R}^2$, the distance between them is
$$d(A, B) = \sqrt{(a - c)^2 + (b - d)^2}.$$


This is also (by definition) the length of the line segment joining the two
points.


Fig. 3.13 The angle $\theta$ corresponding to $(x, y)$


It then follows that distance is translation and rotation invariant (Ex- 
ercise 3.6.2). From this follows the Pythagoras theorem for right triangles 
(Exercise 3.6.4). 
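
As a numerical sanity check (not a proof) of the invariance claimed above, the following Python sketch applies one rotation and one translation to a pair of points and compares distances. The helper names `rotate`, `translate`, `dist` and the sample points are our own choices for illustration.

```python
import math

def rotate(theta, p):
    """Rotation by theta, as defined in the text."""
    x, y = p
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def translate(shift, p):
    """Translation by shift = (a, b)."""
    return (p[0] + shift[0], p[1] + shift[1])

def dist(p, q):
    """Distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

A, B = (1.0, 2.0), (-3.0, 0.5)
d0 = dist(A, B)
d_rot = dist(rotate(0.7, A), rotate(0.7, B))
d_tr = dist(translate((4.0, -1.0), A), translate((4.0, -1.0), B))
print(d0, d_rot, d_tr)
```

The three printed values agree up to rounding error; the exact statement is the content of Exercise 3.6.2.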

In this section, we defined sin and cos as power series; then we established 
the connection with the unit circle. It is possible to reverse this point of 
view and first define sin and cos from the unit circle and then derive their 
properties and Taylor expansions (see Exercise 4.4.29). Because of this, the 
trigonometric functions are also called the circular functions. 

Going back to the main development, the tangent function, $\tan x =
\sin x/\cos x$, is smooth everywhere except at odd multiples of $\pi/2$, where the
denominator vanishes. Moreover, $\tan x$ is an odd function, and
$$\tan(x + \pi) = \frac{\sin(x + \pi)}{\cos(x + \pi)} = \frac{-\sin x}{-\cos x} = \tan x.$$

So $\pi$ is the period for $\tan x$. By the quotient rule,
$$(\tan x)' = \frac{(\sin x)'\cos x - \sin x(\cos x)'}{(\cos x)^2} = \frac{1}{\cos^2 x}, \qquad -\pi/2 < x < \pi/2.$$
Thus, $\tan x$ is strictly increasing on $(-\pi/2, \pi/2)$. Moreover,
$$\tan(\pi/2-) = \infty,$$
$$\tan(-\pi/2+) = -\infty,$$
and
$$(\tan x)'' = \left(\frac{1}{\cos^2 x}\right)' = \frac{2\tan x}{\cos^2 x}.$$

Thus, $\tan x$ is convex on $(0, \pi/2)$ and concave on $(-\pi/2, 0)$. The graph is as
shown in Figure 3.14.

By the inverse function theorem, $\tan x$ has an inverse on $(-\pi/2, \pi/2)$. This
inverse, $\arctan : (-\infty, \infty) \to (-\pi/2, \pi/2)$, is smooth, and
$$(\arctan x)' = \frac{1}{1/\cos^2(\arctan x)} = \cos^2(\arctan x).$$


Fig. 3.14 The graphs of $\tan x$ and $\arctan x$


Since $\cos x$ is positive on $(-\pi/2, \pi/2)$, dividing (3.6.1) by $\cos^2 x$, we have
$\tan^2 x + 1 = 1/\cos^2 x$. Hence, $\cos^2 x = 1/(1 + \tan^2 x)$. Thus,
$$(\arctan x)' = \frac{1}{1 + x^2}.$$

It follows that $\arctan x$ is strictly increasing on $\mathbf{R}$. Since $(\arctan x)'' =
-2x/(1 + x^2)^2$, $\arctan x$ is convex for $x < 0$ and concave for $x > 0$. Moreover,
$\arctan\infty = \pi/2$ and $\arctan(-\infty) = -\pi/2$. The graph is as shown in
Figure 3.14.

Often, we will use the convenient abbreviations $\sec x = 1/\cos x$, $\csc x =
1/\sin x$, and $\cot x = 1/\tan x$. These are the secant, cosecant, and cotangent
functions. For example, $(\tan x)' = \sec^2 x$, and $(\cot x)' = -\csc^2 x$.

If $t = \tan(\theta/2)$, we have the half-angle formulas
$$\sin\theta = \frac{2t}{1 + t^2},$$
$$\cos\theta = \frac{1 - t^2}{1 + t^2},$$
and
$$\tan\theta = \frac{2t}{1 - t^2}.$$

Exercises 


3.6.1. Derive the half-angle formulas. 

3.6.2. Let $T$ be a translation, $R$ a rotation, and $A = (a, b)$ and $B = (c, d)$
points in the plane. Show that $d(A, B) = d(T(A), T(B))$ and $d(A, B) =
d(R(A), R(B))$.

3.6.3. With $R(x, y) = (x\cos\theta - y\sin\theta, x\sin\theta + y\cos\theta)$, show that
$$R(\cos\phi, \sin\phi) = (\cos(\phi + \theta), \sin(\phi + \theta)),$$
justifying the term "rotation."


3.6.4. Let $A$, $B$, $C$ be three vertices of a triangle in the plane. Then the
angle at $C$ is that obtained after translating the vertex $C$ to the origin. The
triangle is a right triangle if the angle at $C$ is $\pi/2$. Suppose $a$, $b$, $c$ are the
lengths of the sides of the triangle formed by $A$, $B$, $C$, with $c$ the length of the
hypotenuse, i.e., the length of the side across from $C$. Show that $a^2 + b^2 = c^2$.


3.6.5. Let $f(x) = x\sin(1/x)$, $x \ne 0$, $f(0) = 0$. Show that $f$ is continuous at
all points, but is not of bounded variation on any interval $(a, b)$ containing 0.


3.6.6. Let $f(x) = x^2\sin(1/x)$, $x \ne 0$, $f(0) = 0$. Show that $f$ is differentiable
at all points with $|f'(x)| \le 1 + 2|x|$. Show that $f$ is of bounded variation on
any bounded interval.


3.6.7. Let $f(x) = x + 2x^2\sin(1/x)$, $x \ne 0$, and $f(0) = 0$. Show that $f$ is not
injective on $(0, \epsilon)$ for any $\epsilon > 0$, but $f'(0) = 1$. This shows the conclusion of
the IFT does not hold with the weakened hypothesis $f'(0) \ne 0$.


3.6.8. Show that, for $0 < x \le 3$, the series
$$\frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \dots$$
has decreasing terms.


3.6.9. Use the trigonometric identities to compute the sine, cosine, and
tangent of $\pi/6$, $\pi/4$, and $\pi/3$.


3.6.10. Show that $\cos(\pi/9)\cos(2\pi/9)\cos(4\pi/9) = 1/8$.


3.6.11. Use the trigonometric identities to show $2\cos(\pi/5)$ is the golden mean
$(1 + \sqrt{5})/2$ (Exercise 1.7.8).


3.6.12. Use (3.6.2) and induction to show that
$$1 + 2\cos x + 2\cos(2x) + \dots + 2\cos(nx) = \frac{\sin\left(\left(n + \frac{1}{2}\right)x\right)}{\sin\left(\frac{1}{2}x\right)}$$
for $n \ge 1$ and $x \notin 2\pi\mathbf{Z}$.

3.6.13. Show that $2\cot(2x) = \cot x - \tan x$.

3.6.14. Show that
$$(x^2 - 2x\cos\theta + 1)\cdot(x^2 - 2x\cos(\pi - \theta) + 1) = (x^4 - 2x^2\cos(2\theta) + 1).$$
Use this to derive the identity
$$x^{2n} - 1 = (x^2 - 1)\prod_{k=1}^{n-1}\left(x^2 - 2x\cos(k\pi/n) + 1\right) \tag{3.6.5}$$
for $n = 2, 4, 8, 16, \dots$.


3.6.15. Use the Dirichlet test (§1.7) to show that $\sum_{n=1}^{\infty}\cos(nx)/n$ is
convergent for $x \notin 2\pi\mathbf{Z}$.


3.6.16. Let $f : \mathbf{R} \to \mathbf{R}$ be continuous. A period of $f$ is a real $p > 0$ satisfying
$f(x + p) = f(x)$ for all $x$. Let $P$ be the set of periods. The period of $f$ is by
definition $p^* = \inf P$. For example, the period of $\sin$ and $\cos$ is $2\pi$. Show that
$p^* = 0$ implies $f$ is constant.


3.7 Primitives 


Let $f$ be defined on $(a, b)$. A differentiable function $F$ is a primitive of $f$ if
$$F'(x) = f(x), \qquad a < x < b.$$

For example, $f(x) = x^3$ has the primitive $F(x) = x^4/4$ on $\mathbf{R}$ since $(x^4/4)' =
(4x^3)/4 = x^3$.

Not every function has a primitive on a given open interval $(a, b)$. Indeed,
if $f$ has a primitive $F$ on $(a, b)$, then by Exercise 3.2.8, $f = F'$ satisfies the
intermediate value property. Hence, $f((a, b))$ must be an interval.

Moreover, Exercise 3.1.6 shows that the presence of a jump discontinuity
in $f$ at a single point in $(a, b)$ is enough to prevent the existence of a primitive
$F$ on $(a, b)$. In other words, if $f$ is defined on $(a, b)$ and $f(c+)$, $f(c-)$ exist
but are not both equal to $f(c)$, for some $c \in (a, b)$, then $f$ has no primitive
on $(a, b)$.

Later (§4.4), we see that every continuous function has a primitive on any 
open interval of definition. 

Now we investigate the converse of the last statement: To what extent
does the existence of a primitive $F$ of $f$ determine the continuity of $f$? To
begin, it is possible (Exercise 3.7.7) for a function $f$ to have a primitive
and to be discontinuous at some points, so the converse is, in general, false.
However, the previous paragraph shows that such discontinuities cannot be
jump discontinuities but must be wild, in the terminology of §2.3. In fact,
it turns out that, wherever $f$ is of bounded variation, the existence of a
primitive forces the continuity of $f$ (Exercise 3.7.8). Thus, a function $f$ that
has a primitive on $(a, b)$ and is discontinuous at a particular point $c \in (a, b)$
must have unbounded variation near $c$, i.e., must be similar to the example
in Exercise 3.7.7.

From the mean value theorem, we have the following simple but
fundamental fact.


Theorem 3.7.1. Any two primitives of $f$ differ by a constant.


Indeed, if $F$ and $G$ are primitives of $f$, then $H = F - G$ is a primitive of
zero, i.e., $H'(x) = (F(x) - G(x))' = 0$ for all $a < x < b$. Hence, by the mean
value theorem, $H(x) - H(y) = H'(c)(x - y) = 0$ for $a < x < y < b$, with $c$
between $x$ and $y$, i.e., $H$ is a constant.

For example, all the primitives of $f(x) = x^3$ are $F(x) = x^4/4 + C$ with
$C$ a real constant. Sometimes, $F$ is called the antiderivative or the indefinite
integral of $f$. We shall use only the term primitive, and symbolically, we write
$$F(x) = \int f(x)\,dx \tag{3.7.1}$$
to mean, no more and no less, that $F'(x) = f(x)$ on the interval under consideration.
The reason for the unusual notation (3.7.1) is the connection between
the primitive and the integral. This is explained in §4.4. With this notation,
$$\int f'(x)\,dx = f(x)$$
is a tautology. As a mnemonic device, we sometimes write $d[f(x)] = f'(x)\,dx$.
With this notation, $\int$ and $d$ "cancel": $\int d[f(x)] = f(x)$.

Based on the derivative formulas in Chapter 3, we can list the primitives
known to us at this point. These identities are valid on any open interval
of definition. These formulas, like any formula involving primitives, can be
checked by differentiation.
$$\int x^a\,dx = \frac{x^{a+1}}{a + 1}, \qquad a \ne -1,$$
$$\int \frac{1}{x}\,dx = \log|x|,$$
$$\int a^x\,dx = \frac{1}{\log a}\,a^x, \qquad a > 0,\ a \ne 1,$$
$$\int \cos x\,dx = \sin x,$$
$$\int \sin x\,dx = -\cos x,$$
$$\int \sec^2 x\,dx = \tan x,$$
$$\int \csc^2 x\,dx = -\cot x,$$
$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin x,$$
and
$$\int \frac{1}{1 + x^2}\,dx = \arctan x.$$



From the linearity of derivatives, we obtain the following. 


Theorem 3.7.2. If $f$ and $g$ have primitives on $(a, b)$ and $k$ is a real, so do
$f + g$ and $kf$, and
$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx,$$
and
$$\int kf(x)\,dx = k\int f(x)\,dx.$$


From the product rule, we obtain the following analog for primitives of 
summation by parts (§1.7). 


Theorem 3.7.3 (Integration By Parts). If $f$ and $g$ are differentiable on
$(a, b)$ and $f'g$ has a primitive on $(a, b)$, then so does $fg'$, and
$$\int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx.$$


To see this, let $F$ be a primitive for $f'g$. Then $(fg - F)' = f'g + fg' -
F' = fg'$, so $fg - F$ is a primitive for $fg'$. We caution the reader that
integration is taken up in §4.3. Here this last result is called integration by
parts because of its usefulness for computing integrals in the next chapter.
There are no integrals in this section.

From the chain rule, we obtain the following. 


Theorem 3.7.4 (Substitution). If $g$ is differentiable on $(a, b)$, $g[(a, b)] \subset
(c, d)$ and $\int f(x)\,dx = F(x)$ on $(c, d)$, then
$$\int f[g(x)]g'(x)\,dx = F[g(x)], \qquad a < x < b.$$


We work out some examples. It is convenient to allow the undefined symbol
$dx$ to enter into the expression for $f$ and to write, e.g., $\int dx$ instead of $\int 1\,dx$
and $\int dx/x$ instead of $\int (1/x)\,dx$.

Substitution is often written as $\int f[g(x)]g'(x)\,dx = \int f(u)\,du$, $u = g(x)$.
For example,
$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \int \frac{d\sin x}{\sin x} = \int \frac{du}{u} = \log|u| = \log|\sin x|.$$

Similarly,
$$\int \frac{2x + 1}{1 + x^2}\,dx = \int \frac{2x}{1 + x^2}\,dx + \int \frac{1}{1 + x^2}\,dx = \int \frac{d(1 + x^2)}{1 + x^2} + \arctan x = \log(1 + x^2) + \arctan x.$$



Particularly useful special cases of substitution are
$$\int f(x)\,dx = F(x) \quad\text{implying}\quad \int f(x + a)\,dx = F(x + a),$$
$$\int f(x)\,dx = F(x) \quad\text{implying}\quad \int f(ax)\,dx = \frac{1}{a}F(ax), \qquad a \ne 0.$$
We call these the translation and dilation properties. Thus, for example,
$$\int e^{ax}\,dx = \frac{1}{a}e^{ax}, \qquad a \ne 0.$$



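Integration by parts is often written as $\int u\,dv = uv - \int v\,du$. If we take
$u = \log x$, $dv = dx$, then $du = u'\,dx = dx/x$, $v = x$, so
$$\int \log x\,dx = x\log x - \int x\,(dx/x) = x\log x - \int dx = x\log x - x.$$

Similarly,
$$\int x\log x\,dx = \frac{x^2}{2}\log x - \int \frac{x^2}{2}\,(dx/x) = \frac{x^2}{2}\log x - \int \frac{x}{2}\,dx = \frac{x^2}{2}\log x - \frac{x^2}{4}.$$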

By a trigonometric formula (§3.6),
$$\int \cos^2 x\,dx = \int \frac{1 + \cos(2x)}{2}\,dx = \frac{x}{2} + \frac{1}{2}\int \cos(2x)\,dx = \frac{x}{2} + \frac{\sin(2x)}{4} = \frac{2x + \sin(2x)}{4}.$$

Since $4x(1 - x) = 4x - 4x^2 = 1 - (2x - 1)^2$,
$$\int \frac{dx}{\sqrt{x(1 - x)}} = \int \frac{2\,dx}{\sqrt{4x(1 - x)}} = \int \frac{d(2x - 1)}{\sqrt{1 - (2x - 1)^2}} = \arcsin(2x - 1)$$
by translation and dilation. Of course, we already know (§3.6) that another
primitive is $2\arcsin\sqrt{x}$, so the two primitives must differ by a constant
(Exercise 3.7.9). The reduction $4x - 4x^2 = 1 - (2x - 1)^2$ is the usual technique
of completing the square.
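
That the two primitives of $1/\sqrt{x(1 - x)}$ differ by a constant on $(0, 1)$ can be observed numerically. The following Python sketch tabulates the difference at a few sample points; the sample grid is an arbitrary choice, and the code only illustrates Theorem 3.7.1, it does not replace the direct argument asked for in Exercise 3.7.9.

```python
import math

# Sample points strictly inside (0, 1).
xs = [0.1 * k for k in range(1, 10)]

# Difference of the two primitives 2*arcsin(sqrt(x)) and arcsin(2x - 1).
diffs = [2 * math.asin(math.sqrt(x)) - math.asin(2 * x - 1) for x in xs]

print(diffs)
print(math.pi / 2)
```

Every entry of `diffs` agrees with $\pi/2$ up to rounding error, consistent with the constant stated in Exercise 3.7.9.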
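To compute $\int \sqrt{1 - x^2}\,dx$, let $x = \sin\theta$, $dx = \cos\theta\,d\theta$. Then
$$\int \sqrt{1 - x^2}\,dx = \int \sqrt{1 - \sin^2\theta}\,\cos\theta\,d\theta$$
$$= \int \cos^2\theta\,d\theta$$
$$= \frac{1}{4}[2\theta + \sin(2\theta)]$$
$$= \frac{1}{2}(\theta + \sin\theta\cos\theta)$$
$$= \frac{1}{2}\left(\arcsin x + x\sqrt{1 - x^2}\right).$$

Alternatively, let $u = \sqrt{1 - x^2}$, $dv = dx$. Then $du = -x\,dx/\sqrt{1 - x^2}$, $v = x$.
So
$$\int \sqrt{1 - x^2}\,dx = x\sqrt{1 - x^2} + \int \frac{x^2\,dx}{\sqrt{1 - x^2}}$$
$$= x\sqrt{1 - x^2} + \int \frac{x^2 - 1}{\sqrt{1 - x^2}}\,dx + \int \frac{dx}{\sqrt{1 - x^2}}$$
$$= x\sqrt{1 - x^2} - \int \sqrt{1 - x^2}\,dx + \arcsin x.$$

Moving the second term on the right to the left side, we obtain the same
result.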


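To compute $\int \frac{dx}{1 - x^2}$, write
$$\frac{1}{1 - x^2} = \frac{1}{2}\left[\frac{1}{1 + x} + \frac{1}{1 - x}\right]$$
to get
$$\int \frac{dx}{1 - x^2} = \frac{1}{2}[\log(1 + x) - \log(1 - x)] = \frac{1}{2}\log\left(\frac{1 + x}{1 - x}\right).$$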


If $f$ is given as a power series, the primitive is easily found as another
power series.


Theorem 3.7.5. If $R > 0$ is the radius of convergence of
$$f(x) = a_0 + a_1x + a_2x^2 + \dots,$$
then $R$ is the radius of convergence of $\sum a_nx^{n+1}/(n + 1)$, and
$$\int f(x)\,dx = a_0x + \frac{a_1x^2}{2} + \frac{a_2x^3}{3} + \dots \tag{3.7.2}$$
on $(-R, R)$.


To see this, one first checks that the radius of convergence of the series in
(3.7.2) is also $R$ using $n^{1/n} \to 1$, as in the previous section. Now differentiate
the series in (3.7.2) obtaining $f(x)$. Hence, the series in (3.7.2) is a primitive.
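
On the level of coefficients, Theorem 3.7.5 says that the primitive is obtained by shifting each coefficient up one degree and dividing by the new exponent. A minimal Python sketch, with hypothetical helper names and the alternating geometric series $1 - x + x^2 - \dots$ as sample input, is:

```python
from fractions import Fraction

def primitive_coeffs(a):
    """Given [a0, a1, a2, ...], return the coefficients of
    a0*x + a1*x^2/2 + a2*x^3/3 + ... (the constant term is taken to be 0)."""
    return [Fraction(0)] + [Fraction(c) / (n + 1) for n, c in enumerate(a)]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))

# First eight coefficients of the alternating geometric series 1 - x + x^2 - ...
geometric = [(-1) ** n for n in range(8)]
log_coeffs = primitive_coeffs(geometric)
print(log_coeffs[:5])
```

Only finitely many coefficients are handled, so this illustrates the formal rule in (3.7.2) and says nothing about convergence, which is the part of the theorem that needs the radius $R$.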


For example, using the geometric series with $-x$ replacing $x$,
$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \dots, \qquad |x| < 1.$$
Hence, by the theorem,
$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots, \qquad |x| < 1. \tag{3.7.3}$$

Indeed, both sides are primitives of $1/(1 + x)$, and both sides equal zero
at $x = 0$. Similarly, using the geometric series with $-x^2$ replacing $x$,
$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \dots, \qquad |x| < 1.$$

Hence, by the theorem,
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots, \qquad |x| < 1. \tag{3.7.4}$$

This follows since both sides are primitives of $1/(1 + x^2)$ and both sides vanish
at $x = 0$. To obtain the sum of the series (1.7.5), we seek to insert $x = 1$
in (3.7.4). We cannot do this directly since (3.7.4) is valid only for $|x| < 1$.
Instead, we let $s_n(x)$ denote the $n$th partial sum of the series in (3.7.4). Since
this series is alternating with decreasing terms (§1.7) when $0 < x < 1$,
$$s_{2n}(x) \le \arctan x \le s_{2n-1}(x), \qquad n \ge 1.$$

In this last inequality, the number of terms in the partial sums is finite.
Letting $x \nearrow 1$, we obtain
$$s_{2n}(1) \le \arctan 1 \le s_{2n-1}(1), \qquad n \ge 1.$$
Now letting $n \nearrow \infty$ and recalling $\arctan 1 = \pi/4$, we arrive at the sum of the
Leibnitz series
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots$$
first discussed in §1.7. In particular, we conclude that $8/3 < \pi < 4$. In §5.2,
we obtain the sum of the Leibnitz series by a procedure that will be useful
in many other situations.

The Leibnitz series is "barely convergent" and is not useful for computing
$\pi$. To compute $\pi$, the traditional route is to insert $x = 1/5$ and $x = 1/239$ in
(3.7.4) and to use Machin's 1706 formula
$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}.$$

Exercises 
3.7.1. Compute 1@ cos x dx. 


3.7.2. Compute ee dz. 


3.7.3. Compute $\int \frac{x + 1}{\sqrt{1 - x^2}}\,dx$.
arctan x 
3.7.4. Compute = 


3.7.5. Compute J 008:r)%ae. 


3.7.6. Compute / V1l—e dz. 


3.7.7. Let $F(x) = x^2\sin(1/x)$, $x \ne 0$, and let $F(0) = 0$. Show that the
derivative $F'(x) = f(x)$ exists for all $x$, but $f$ is not continuous at $x = 0$.


3.7.8. If $f$ is of bounded variation and $F' = f$ on $(a, b)$, then $f$ is continuous.


3.7.9. Show directly (i.e., without derivatives) $2\arcsin\sqrt{x} = \arcsin(2x - 1) +
\pi/2$.


3.7.10. Show
$$\arcsin x = x + \frac{1}{2}\cdot\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{x^5}{5} + \dots, \qquad |x| < 1.$$

3.7.11. If $f$ is a polynomial and $F(x) = f(x) - f''(x) + f''''(x) - \dots$, then
$$\int f(x)\sin x\,dx = F'(x)\sin x - F(x)\cos x.$$


3.7.12. Show that
$$\tan(a + b) = \frac{\tan a + \tan b}{1 - \tan a\tan b}.$$

Use this to derive Machin's formula.


3.7.13. Use Machin's formula and (3.7.4) to obtain $\pi = 3.14\dots$ to within an
error of $10^{-2}$.


3.7.14. Simplify arcsin(sin 100). 


—4 
3.7.15. Compute / _—* dz. 
1— 2? 


4/2—4 
3.7.16. Compute [A= * a, 
a2 — V2Qr+1 
3.7.17. Show that
$$\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$$
following the procedure discussed above for the Leibnitz series.


Chapter 4 
Integration 


4.1 The Cantor Set 


The subject of this chapter is the measurement of the areas of subsets of
the plane $\mathbf{R}^2 = \mathbf{R} \times \mathbf{R}$. The areas of elementary geometric figures, such as
squares, rectangles, and triangles, are already known to us. By known to us
we mean that, e.g., by defining the area of a rectangle to be the product of
the lengths of its sides, we obtain quantities that agree with our intuition.
Since every right-angle triangle is half a rectangle, the areas of right-angle
triangles are also known to us. Similarly, we can obtain the area of a general
triangle.

How does one approach the problem of measuring the area of an unfamiliar
figure or subset of $\mathbf{R}^2$, say a subset that cannot be broken up into triangles?
For example, how does one measure the area of the unit disk
$$D = \{(x, y) : x^2 + y^2 < 1\}?$$

One solution is to arbitrarily define the area of $D$ to equal whatever one
feels is right. The Egyptian book of Ahmes (~ 3,900 years ago) states that
the area of $D$ is $(16/9)^2$. In the Indian Sulbasutras (written down ~ 2,500
years ago), the area of $D$ is taken to equal $(26/15)^2$. Albrecht Dürer (~ 500
years ago) of Nuremberg solved a related problem which amounted to taking
the area of $D$ to equal $25/8$.

Which of these answers should we accept as the area of $D$? If we treat
these answers as estimates of the area of $D$, then in our minds, we must have
the presumption that such a quantity, the area of $D$, has a meaningful
existence. In that case, we have no way of judging the merit of an estimate
except by the quality of the reasoning leading to it.

Realizing this, by reasoning that remains perfectly valid today, Archimedes
(~ 2,250 years ago) carefully established
$$3 + \frac{10}{71} < \operatorname{area}(D) < 3 + \frac{1}{7}.$$
In §4.4, we show that $\operatorname{area}(D) = \pi$, where $\pi$, "Archimedes' constant," is the
real number defined in §3.6.

At the basis of the Greek mathematicians' computations of area was the
method of exhaustion. This asserted that the area of a set $A \subset \mathbf{R}^2$ could be
computed as the limit of areas of a sequence of inscribed sets $(A_n)$ that filled
out more and more of $A$ as $n \nearrow \infty$ (Figure 4.1). Nevertheless, the Greeks
were apparently uncomfortable with the concept of infinity and never used
this method as stated. Instead, for example, in dealing with $D$, Archimedes
used inscribed and circumscribed polygons with 96 sides to obtain the above
result. He never explicitly passed to the limit. It turns out, however, that the
method of exhaustion is so important to integration that in §4.5 we give a
careful derivation of it.


Fig. 4.1 The method of exhaustion 


Now the unit disk is not a totally unfamiliar set to the reader. But, if we 
are presented with some genuinely unfamiliar subset C, the situation changes, 
and we may no longer have any clear conception of the area of C. If we are 
unable to come up with a procedure leading us to the area, then we may be 
forced to reexamine our intuitive notion of area. In particular, we may be led 
to the conclusion that the “true area” of C may have no meaning. Let us 
describe such a subset. 

Let $C_0$ denote the compact unit square $[0, 1] \times [0, 1]$. Divide $C_0$ into nine
equal subsquares and take out from $C_0$ all but the four compact corner
subsquares. Let $C_1$ be the remainder, i.e., the union of the four remaining
compact subsquares. Repeat this process with each of the four subsquares. Divide
each subsquare into nine equal compact subsubsquares and take out, in each
subsquare, all but the four compact corner subsubsquares. Call the union of
the remaining sixteen subsubsquares $C_2$. Continuing in this manner yields a
sequence
$$C_0 \supset C_1 \supset C_2 \supset \dots.$$
The Cantor set is the common part (Figure 4.2) of all these sets, i.e., their
intersection
$$C = \bigcap_{n=1}^{\infty} C_n.$$

At first glance, it is not clear that $C$ is not empty. But $(0, 0) \in C$! Moreover,
the sixteen corners of the set $C_1$ are in $C$. Similarly, any corner of any
subsquare, at any level, lies in $C$. But the set of such points is countable (§1.7),
and it turns out that there is much more: There are as many points in $C$ as
there are in the unit square $C_0$. In particular, $C$ is uncountable.


Fig. 4.2 The Cantor set (panels $C_0$, $C_1$, $C_2$)


To see this, recall the concept of ternary expansions (§1.6). Let $a \in [0, 1]$.
We say that
$$a = .a_1a_2\dots$$
is the ternary expansion of $a$ if the naturals $a_n$ are ternary digits 0, 1, 2, and
$$a = \sum_{n=1}^{\infty} a_n3^{-n}.$$

Now let $(a, b) \in C_0$, and let
$$a = .a_1a_2\dots$$
and
$$b = .b_1b_2\dots$$
be ternary expansions of $a$ and $b$. If $a_1 \ne 1$ and $b_1 \ne 1$, then $(a, b)$ is in $C_1$.
Similarly, in addition, if $a_2 \ne 1$ and $b_2 \ne 1$, $(a, b) \in C_2$. Continuing in
this manner, we see that, if $a_n \ne 1$ and $b_n \ne 1$ for all $n \ge 1$, $(a, b) \in C$.
Conversely, $(a, b) \in C$ implies that there are ternary expansions of $a$ and $b$
as stated. Thus, $(a, b) \in C$ iff $a$ and $b$ have ternary expansions in which the
digits are equal to 0 or 2.
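
The digit criterion can be turned into a finite test for membership in the $n$th stage $C_n$. The Python sketch below works with exact rationals and peels off one ternary digit at a time, allowing either expansion at the endpoints $1/3$ and $2/3$; the function names `in_level` and `in_C` are hypothetical, and a finite number of stages can of course only certify membership in $C_n$, not in $C$ itself.

```python
from fractions import Fraction

def in_level(a, n):
    """True if a has a ternary expansion whose first n digits avoid 1,
    i.e., a lies in the n-th stage of the construction along one coordinate."""
    a = Fraction(a)
    for _ in range(n):
        if 0 <= a <= Fraction(1, 3):
            a = 3 * a          # leading ternary digit 0
        elif Fraction(2, 3) <= a <= 1:
            a = 3 * a - 2      # leading ternary digit 2
        else:
            return False       # every expansion has leading digit 1
    return 0 <= a <= 1

def in_C(a, b, n):
    """True if (a, b) lies in C_n."""
    return in_level(a, n) and in_level(b, n)

print(in_C(0, 0, 20))
print(in_C(Fraction(1, 4), Fraction(3, 4), 30))
print(in_C(Fraction(1, 2), 0, 1))
```

For instance, $1/4 = .020202\dots$ and $3/4 = .202020\dots$ in ternary, so the point $(1/4, 3/4)$ passes every stage, while $1/2 = .111\dots$ fails at the first.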

Now although some reals may have more than one ternary expansion, a
real $a$ cannot have more than one ternary expansion $.a_1a_2\dots$ where $a_n \ne 1$
for all $n \ge 1$ because any two ternary expansions yielding the same real must
have their $n$th digits differing by 1 for some $n \ge 1$ (Exercise 1.6.2 treats the
decimal case). Thus, the mapping
$$(a, b) = (.a_1a_2a_3\dots, .b_1b_2b_3\dots) \mapsto \left(\sum_{n=1}^{\infty} a'_n2^{-n}, \sum_{n=1}^{\infty} b'_n2^{-n}\right),$$
where $a'_n = a_n/2$, $b'_n = b_n/2$, $n \ge 1$, is well defined. Since any real in $[0, 1]$
has a binary expansion, this mapping is a surjection of the Cantor set $C$ onto
the unit square $C_0$. Since $C_0$ is uncountable (Exercise 1.7.4), we conclude
that $C$ is uncountable (§1.7).

The difficulty of measuring the size of the Cantor set underscores the
difficulty in arriving at a consistent notion of area. Above, we saw that the
Cantor set is uncountable. In this sense, the Cantor set is "big." On the other
hand, note that the areas of the subsquares removed from $C_0$ to obtain $C_1$
sum to $5/3^2$. Similarly, the areas of the sub-subsquares removed from $C_1$
to obtain $C_2$ sum to $20/9^2$. Similarly, at the next stage, we remove squares
with areas summing to $80/27^2$. Thus, the sum of the areas of all the removed
squares is
$$\frac{5}{3^2} + \frac{20}{9^2} + \frac{80}{27^2} + \dots = \frac{5}{9}\left(1 + \frac{4}{9} + \left(\frac{4}{9}\right)^2 + \dots\right) = \frac{5}{9}\cdot\frac{1}{1 - \frac{4}{9}} = 1.$$

Since $C$ is the complement of all these squares in $C_0$ and $C_0$ has area 1, the
area of $C$ is $1 - 1 = 0$. Thus, in the sense of area, the Cantor set is "small."
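
The geometric series above is easy to tabulate exactly. In the Python sketch below (the function name `removed_area` is our own), stage $n$ removes five of the nine pieces of each of the $4^{n-1}$ squares present, each piece having side $3^{-n}$.

```python
from fractions import Fraction

def removed_area(stages):
    """Total naive area of the squares removed in the first `stages` steps."""
    total = Fraction(0)
    for n in range(1, stages + 1):
        squares_removed = 5 * 4 ** (n - 1)   # 5 of 9 pieces in each of 4^(n-1) squares
        side = Fraction(1, 3 ** n)           # side length of a piece at stage n
        total += squares_removed * side ** 2
    return total

for n in (1, 2, 3, 10):
    print(n, removed_area(n), float(1 - removed_area(n)))
```

The printed remainders are $(4/9)^n$, the naive area of $C_n$, which tends to 0. As the text goes on to stress, passing from these finite sums to a statement about $C$ requires a proper definition of area.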

This argument is perfectly reasonable, except for one aspect. We are as- 
suming that areas can be added and subtracted in the usual manner, even 
when there are infinitely many sets involved. In §4.2, we show that, with 
an appropriate definition of area, this argument can be modified to become 
correct, and the area of C is in fact zero. 

Another indication of the smallness of $C$ is the fact that $C$ has no interior.
To explain this, given any set $A \subset \mathbf{R}^2$, let us say that $A$ has interior if we
can fit some rectangle $Q$ within $A$, i.e., $Q \subset A$. If we cannot fit a (nontrivial)
rectangle, no matter how small, within $A$, then we say that $A$ has no interior.
For example, the unit disk has interior, but a line segment has no interior. The
Cantor set $C$ has no interior, because there is a point in every rectangle whose
coordinate ternary expansions contain at least one digit 1. Alternatively, if
$C$ contained a rectangle $Q$, then the area of $C$ would be at least as much as
the area of $Q$, which is positive. But we saw above that the area of $C$ equals
zero.

Since this reasoning applies to any set, we see that if $A \subset \mathbf{R}^2$ has interior,
then the area of $A$ is positive. The surprising fact is that the converse of
this statement is false. There are sets $A \subset \mathbf{R}^2$ that have positive area but
have no interior. Such a set $C^\alpha$ is described in Exercise 4.1.2. To add to the
confusion, even though $C^\alpha$ has no interior, there is (Exercise 4.2.15) some
rectangle $Q$ such that $\operatorname{area}(C^\alpha \cap Q)$ is at least $.99\,\operatorname{area}(Q)$.

These issues are discussed to point out the existence of unavoidable
phenomena involving area where things do not behave as simply as triangles. In
the first three decades of this century, these issues were finally settled. The
solution to the problem of area, analyzed extensively by Archimedes more
than two thousand years ago, can now be explained in a few pages. Why
did it take so long for the solution to be discovered? It should not be too
surprising that one missing ingredient was the completeness property of the
set of real numbers, the importance of which was not fully realized until the
nineteenth century.


Exercises 


4.1.1. Let $C_0 = [0, 1] \times [0, 1]$ denote the unit square, and let $C'_1$ be obtained
by throwing out from $C_0$ the middle subrectangle $(1/3, 2/3) \times [0, 1]$ of width
$1/3$ and height 1. Then $C'_1$ consists of two compact subrectangles. Let $C'_2$ be
obtained from $C'_1$ by throwing out, in each of the subrectangles, the middle
sub-subrectangles $(1/9, 2/9) \times [0, 1]$ and $(7/9, 8/9) \times [0, 1]$, each of width $1/3^2$
and height 1. Then $C'_2$ consists of four compact sub-subrectangles. Similarly,
$C'_3$ consists of eight compact sub-subsubrectangles, obtained by throwing
out from $C'_2$ the middle sub-subsubrectangles of width $1/3^3$ and height 1.
Continuing in this manner, we have $C'_1 \supset C'_2 \supset C'_3 \supset \dots$. Let $C' = \bigcap_{n=1}^{\infty} C'_n$.
Show that $\operatorname{area}(C') = 0$ and $C'$ has no interior.


4.1.2. Fix a real $0 < \alpha < 1$ (e.g., $\alpha = .7$) and let $C_0 = [0, 1] \times [0, 1]$ be the unit
square. Let $C_1^\alpha$ be obtained from $C_0$ by throwing out the middle subrectangle
of width $\alpha/3$ and height 1. Then $C_1^\alpha$ consists of two subrectangles. Let $C_2^\alpha$ be
obtained from $C_1^\alpha$ by throwing out, in each of the subrectangles, the middle
sub-subrectangles of width $\alpha/3^2$ and height 1. Then $C_2^\alpha$ consists of four
sub-subrectangles. Similarly, $C_3^\alpha$ consists of eight sub-subsubrectangles, obtained
by throwing out from $C_2^\alpha$ the middle sub-subsubrectangles of width $\alpha/3^3$ and
height 1. Continuing in this manner, we have $C_1^\alpha \supset C_2^\alpha \supset C_3^\alpha \supset \dots$. Let
$$C^\alpha = \bigcap_{n=1}^{\infty} C_n^\alpha.$$
Show that $\operatorname{area}(C^\alpha) > 0$, but $C^\alpha$ has no interior.

4.1.3. For $A \subset \mathbf{R}^2$, let
$$A + A = \{(x + x', y + y') : (x, y) \in A, (x', y') \in A\}$$
be the set of sums. Show that $C + C = [0, 2] \times [0, 2]$ (see Exercise 1.6.5).


4.2 Area


Let $I$ and $J$ be intervals, i.e., subsets of $\mathbf{R}$ of the form $(a, b)$, $[a, b]$, $(a, b]$,
or $[a, b)$. As usual, we allow the endpoints $a$ or $b$ to equal $\pm\infty$ when they
are not included within the interval. A rectangle is a subset of $\mathbf{R}^2 = \mathbf{R} \times \mathbf{R}$
(Figure 4.3) of the form $I \times J$. A rectangle $Q = I \times J$ is open if $I$ and $J$ are
both open intervals, closed if $I$ and $J$ are both closed intervals, and compact
if $I$ and $J$ are both compact intervals. For example, the plane $\mathbf{R}^2$ and the
upper-half plane $\mathbf{R} \times (0, \infty)$ are open rectangles. We say that a rectangle
$Q = I \times J$ is bounded if $I$ and $J$ are bounded subsets of $\mathbf{R}$. For example,
the vertical line segment $\{a\} \times [c, d]$ is a compact rectangle. A single point
is a compact rectangle. If $I$ is a bounded interval, let $\bar{I}$ denote the compact
interval with the same endpoints, and let $I^\circ$ denote the open interval with the
same endpoints. If $Q = I \times J$ is a bounded rectangle, the compact rectangle
$\bar{Q} = \bar{I} \times \bar{J}$ is its compactification. If $Q = I \times J$ is any rectangle, then the open
rectangle $Q^\circ = I^\circ \times J^\circ$ is its interior. Note that $Q^\circ \subset Q \subset \bar{Q}$ and $\bar{Q} \setminus Q^\circ$ is
a subset of the sides of $Q$, for any rectangle $Q$. Note that an open rectangle
may be empty, for example, $(a, a) \times (c, d)$ is empty.


Fig. 4.3 A and B are rectangles, but C is not 


Let A be a subset of R?. A countable cover of A is a sequence of sets (An) 
such that A is contained in their union,! 


Ac U An. 


n=1 


In a given cover, the sets (A,) may overlap, i.e., intersect. If, for some N, 

A, = 0 for n > N, we say that (A,,..., Ay’) is a finite cover (Figure 4.4). 
A paving of A is a countable cover (Q,,), where the sets Q,, n > 1, are 

rectangles. A finite paving is a finite cover that is also a paving (Figure 4.5). 


* Uo, An = Uf{An : n € N} and %_, An = (\{An : n € N}, see §2.1. 


n=1 
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Fig. 4.4 A finite cover 


Every subset $A \subset \mathbf{R}^2$ has at least one (not very interesting) paving $Q_1 = \mathbf{R}^2$,
$Q_2 = \emptyset$, $Q_3 = \emptyset$, ....


Fig. 4.5 A finite paving 


For any interval $I$ as above, let $|I| = b - a$ denote the length of $I$. For
any rectangle $Q = I \times J$, let $\|Q\| = |I| \cdot |J|$. More generally, for any rotated
rectangle, let $\|Q\|$ denote the product of the lengths of its sides (length is
defined in §3.6). Then $\|Q\|$, the traditional high-school formula for the area
of a rectangle, is a positive real or $0$ or $\infty$. We also take $\|\emptyset\| = 0$. For reasons
discussed below, we call $\|Q\|$ the naive area of $Q$.

Let $A$ be a subset of $\mathbf{R}^2$. The area² of $A$ is defined by

$$\operatorname{area}(A) = \inf\left\{ \sum_{n=1}^{\infty} \|Q_n\| : \text{all pavings } (Q_n) \text{ of } A \right\}. \tag{4.2.1}$$


This definition of area is at the basis of all that follows. It is necessarily
complicated because it applies to all subsets $A$ of $\mathbf{R}^2$. As an immediate
consequence of the definition, $\operatorname{area}(\emptyset) = 0$. Similarly, the area of a finite vertical
line segment $A$ is zero since $A$ can be covered by a thin rectangle of arbitrarily
small naive area.

In words, the definition says that to find the area of a set $A$, we cover $A$
by a sequence $Q_1, Q_2, \dots$ of rectangles, measure the sum of their naive areas,
and take this sum as an estimate for the area of $A$. Of course, we expect that
this sum will be an overestimate of the area of $A$ for two reasons. The paving
may cover a superset of $A$, and we are not taking into account any overlaps
when computing the sum. Then we define the area of $A$ to be the inf of these
sums.


2 The usual terminology is two-dimensional Lebesgue measure.

Of course, carrying out this procedure explicitly, even for simple sets A, 
is completely impractical. Because of this, we almost never use the definition 
directly to compute areas. Instead, as is typical in mathematics, we derive 
the elementary properties of area from the definition, and we use them to 
compute areas. 

Whether or not we can compute the area of a given set, the above definition
applies consistently to every subset $A$. In particular, this is so whether $A$ is a
rectangle, a triangle, a smooth graph, or the Cantor set $C$. Let us now derive
the properties of area that follow immediately from the definition.

Since every rectangle $Q$ is a paving of itself, $\operatorname{area}(Q) \le \|Q\|$. Below, we
obtain $\operatorname{area}(Q) = \|Q\|$ for a rectangle $Q$. Subsequently, we establish rotation
invariance of area and obtain $\operatorname{area}(Q) = \|Q\|$ for rotated rectangles $Q$ (rotation
invariance of the lengths of the sides is in §3.6). Until we establish this,
we repeat that we refer to $\|Q\|$ as the naive area of $Q$.

If $(a,b)$ is a point in $\mathbf{R}^2$ and $A \subset \mathbf{R}^2$, the set

$$A + (a,b) = \{(x + a, y + b) : (x,y) \in A\}$$

is the translate of $A$ by $(a,b)$. Then $[A + (a,b)] + (c,d) = A + (a+c, b+d)$
and, for any rectangle $Q$, $Q + (a,b)$ is a rectangle and $\|Q + (a,b)\| = \|Q\|$.
From this follows the translation invariance of area,

$$\operatorname{area}[A + (a,b)] = \operatorname{area}(A), \qquad A \subset \mathbf{R}^2.$$


To see this, let $(Q_n)$ be a paving of $A$. Then $(Q_n + (a,b))$ is a paving of
$A + (a,b)$, so

$$\operatorname{area}[A + (a,b)] \le \sum_{n=1}^{\infty} \|Q_n + (a,b)\| = \sum_{n=1}^{\infty} \|Q_n\|.$$

Since $\operatorname{area}(A)$ is the inf of the sums on the right, $\operatorname{area}[A + (a,b)] \le \operatorname{area}(A)$.
Now in this last inequality, replace, in order, $(a,b)$ by $(-a,-b)$ and $A$
by $A + (a,b)$. We obtain $\operatorname{area}(A) \le \operatorname{area}[A + (a,b)]$. Hence, $\operatorname{area}(A) =
\operatorname{area}[A + (a,b)]$, establishing translation invariance (Figure 4.6).

If $k > 0$ is real and $A \subset \mathbf{R}^2$, the set

$$kA = \{(kx, ky) : (x,y) \in A\}$$

is the dilate of $A$ by $k$. Then $k(cA) = (kc)A$ for $k$ and $c$ positive, $kQ$ is a
rectangle, and $\|kQ\| = k^2 \|Q\|$ for every rectangle $Q$. From this follows the
dilation invariance of area,

$$\operatorname{area}(kA) = k^2 \cdot \operatorname{area}(A), \qquad A \subset \mathbf{R}^2.$$


Fig. 4.6 $\operatorname{area}(A) = \operatorname{area}[A + (a,b)]$


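To see this, let $(Q_n)$ be a paving of $A$. Then $(kQ_n)$ is a paving of $kA$, so

$$\operatorname{area}(kA) \le \sum_{n=1}^{\infty} \|kQ_n\| = k^2 \left( \sum_{n=1}^{\infty} \|Q_n\| \right).$$

Since $\operatorname{area}(A)$ is the inf of the sums on the right, we obtain $\operatorname{area}(kA) \le
k^2 \cdot \operatorname{area}(A)$. Now in this last inequality, replace, in order, $k$ by $1/k$ and $A$
by $kA$. We obtain $k^2 \cdot \operatorname{area}(A) \le \operatorname{area}(kA)$. Hence, $\operatorname{area}(kA) = k^2 \cdot \operatorname{area}(A)$,
establishing dilation invariance (Figure 4.7).


Fig. 4.7 $\operatorname{area}(kA) = k^2 \cdot \operatorname{area}(A)$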


Instead of dilation from the origin, we can dilate from any point in $\mathbf{R}^2$.
In particular, by elementary geometry, certain subsets such as rectangles,
triangles, and parallelograms have well-defined centers, and we can dilate
from these centers. Given a subset $A$ and an arbitrary point $(a,b)$, its centered
dilation $kA$ (from $(a,b)$) is the set (Figure 4.8) obtained by translating $(a,b)$
to $(0,0)$, dilating as above by the factor $k > 0$, and then translating $(0,0)$
back to $(a,b)$. Then $\operatorname{area}(kA) = k^2 \cdot \operatorname{area}(A)$ for centered dilations as well.
For example, if $k < 1$, $kA$ is $A$ shrunk towards $(a,b)$.


Fig. 4.8 Centered dilation


Similarly, for $k \in \mathbf{R}$, let $H(x,y) = (kx, y)$ and $V(x,y) = (x, ky)$; then
$H(A)$ and $V(A)$ are the horizontal dilation and vertical dilation of $A$, and
$\operatorname{area}(V(A)) = |k| \operatorname{area}(A)$ and $\operatorname{area}(H(A)) = |k| \operatorname{area}(A)$ for every set $A$
(Exercise 4.2.6). Note these dilations incorporate reflections when $k < 0$.

The map $F(x,y) = (y,x)$ is a reflection across the line $y = x$; if $Q$ is a
rectangle, then so is $F(Q)$, and $\|F(Q)\| = \|Q\|$. It follows as above that the
flip $F$ preserves area, $\operatorname{area}(F(A)) = \operatorname{area}(A)$.

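As with dilation, setting $-A = \{(-x,-y) : (x,y) \in A\}$, we have reflection
invariance of area,

$$\operatorname{area}(-A) = \operatorname{area}(A), \qquad A \subset \mathbf{R}^2,$$

and monotonicity,

$$\operatorname{area}(A) \le \operatorname{area}(B), \qquad A \subset B \subset \mathbf{R}^2.$$

From the above, we have $\operatorname{area}(kA) = |k|^2 \operatorname{area}(A)$ for all $k$ real, whether the
dilation is centered or not.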

The diagonal line segment $A = \{(x,y) : 0 \le x \le 1,\ y = x\}$ has zero area
(Figure 4.9). To see this, choose $n \ge 1$ and let $Q_k = \{(x,y) : (k-1)/n \le
x \le k/n,\ (k-1)/n \le y \le k/n\}$, $k = 1, \dots, n$. Then $(Q_1, \dots, Q_n)$ is a paving
of $A$. Hence, $\operatorname{area}(A) \le \|Q_1\| + \cdots + \|Q_n\| = n \cdot \frac{1}{n^2} = 1/n$. Since $n \ge 1$ may
be arbitrarily large, we conclude that $\operatorname{area}(A) = 0$. Similarly the area of any
finite line segment is zero.
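The estimate $\operatorname{area}(A) \le 1/n$ for the diagonal segment can be checked mechanically. The following is a minimal sketch, not part of the original text; the function names `diagonal_paving`, `paving_sum`, and `covers_diagonal` are hypothetical, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def diagonal_paving(n):
    """Squares Q_k = [(k-1)/n, k/n] x [(k-1)/n, k/n], k = 1..n, stored as (lo, hi)."""
    return [(Fraction(k - 1, n), Fraction(k, n)) for k in range(1, n + 1)]

def paving_sum(n):
    """Sum of the naive areas ||Q_1|| + ... + ||Q_n||."""
    return sum((hi - lo) ** 2 for lo, hi in diagonal_paving(n))

def covers_diagonal(n, samples=200):
    """Spot-check that sample points (t, t), 0 <= t <= 1, lie in some Q_k."""
    squares = diagonal_paving(n)
    points = [Fraction(i, samples) for i in range(samples + 1)]
    return all(any(lo <= t <= hi for lo, hi in squares) for t in points)

for n in (1, 10, 100):
    print(n, paving_sum(n))   # the sum shrinks like 1/n
```

Since the sums of naive areas can be made as small as we like, the inf in (4.2.1) is zero.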

Another property is subadditivity. For any countable cover $(A_n)$ of a given
set $A$,

$$\operatorname{area}(A) \le \sum_{n=1}^{\infty} \operatorname{area}(A_n). \tag{4.2.2}$$

Here the sets $A_n$, $n \ge 1$, need not be rectangles. For future reference, we
call the sum on the right side of (4.2.2) the area of the cover $(A_n)$.
In particular, since $(A, B)$ is a cover of $A \cup B$,

$$\operatorname{area}(A \cup B) \le \operatorname{area}(A) + \operatorname{area}(B).$$


Fig. 4.9 Area of a diagonal line segment


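Similarly, $(A_1, A_2, \dots, A_n)$ is a cover of $A_1 \cup A_2 \cup \dots \cup A_n$,³ so

$$\operatorname{area}(A_1 \cup A_2 \cup \dots \cup A_n) \le \operatorname{area}(A_1) + \operatorname{area}(A_2) + \cdots + \operatorname{area}(A_n).$$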


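To obtain subadditivity, note that, if the right side of (4.2.2) is $\infty$, there
is nothing to show, since in that case (4.2.2) is true. Hence, we may safely
assume $\operatorname{area}(A_n) < \infty$ for all $n \ge 1$. Let $\epsilon > 0$, and, for each $k \ge 1$, choose⁴
a paving $(Q_{k,n})$ of $A_k$ satisfying

$$\sum_{n=1}^{\infty} \|Q_{k,n}\| < \operatorname{area}(A_k) + \epsilon 2^{-k}. \tag{4.2.3}$$

This is possible since $\operatorname{area}(A_k)$ is the inf of sums of the form $\sum_{n=1}^{\infty} \|Q_n\|$.
Then the double sequence $(Q_{k,n})$ is a cover by rectangles, hence a paving of
$A$. Summing (4.2.3) over $k \ge 1$, we obtain

$$\operatorname{area}(A) \le \sum_{k=1}^{\infty} \left( \sum_{n=1}^{\infty} \|Q_{k,n}\| \right)
\le \sum_{k=1}^{\infty} \left( \operatorname{area}(A_k) + \epsilon 2^{-k} \right) = \left( \sum_{k=1}^{\infty} \operatorname{area}(A_k) \right) + \epsilon.$$

Since $\epsilon > 0$ is arbitrary, subadditivity follows.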

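Now let $Q$ be any bounded rectangle and let $\bar{Q}$ and $Q^\circ$ denote its
compactification and its interior respectively. We claim $\operatorname{area}(\bar{Q}) = \operatorname{area}(Q^\circ)$. To see
this, by monotonicity, we have $\operatorname{area}(\bar{Q}) \ge \operatorname{area}(Q^\circ)$. Conversely, let $t > 1$ and
let $tQ^\circ$ denote the centered dilation of $Q^\circ$. Then $tQ^\circ \supset \bar{Q}$, so by monotonicity
and dilation, $\operatorname{area}(\bar{Q}) \le \operatorname{area}(tQ^\circ) = t^2 \operatorname{area}(Q^\circ)$. Since $t > 1$ is arbitrary,
we have $\operatorname{area}(\bar{Q}) \le \operatorname{area}(Q^\circ)$; hence, the claim follows. As a consequence,
we see that for any bounded rectangle $Q$, we have $\operatorname{area}(Q) = \operatorname{area}(\bar{Q})$.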


3 $\bigcup_{k=1}^{n} A_k$, $\bigcap_{k=1}^{n} A_k$ are defined by induction; see §2.1.
4 This uses the axiom of countable choice.


Theorem 4.2.1. The area of a rectangle $Q$ equals the product of the lengths
of its sides, $\operatorname{area}(Q) = \|Q\|$.


The derivation of this nontrivial result is in several steps. 


Step 1 


Assume first $Q$ is bounded. Since $\operatorname{area}(Q) = \operatorname{area}(\bar{Q})$ and $\|Q\| = \|\bar{Q}\|$, we
may assume without loss of generality that $Q$ is a compact rectangle. Since
we already know $\operatorname{area}(Q) \le \|Q\|$, we need only derive $\|Q\| \le \operatorname{area}(Q)$. By
the definition of area (4.2.1), this means we need to show that

$$\|Q\| \le \sum_{n=1}^{\infty} \|Q_n\| \tag{4.2.4}$$

for every paving $(Q_n)$ of $Q$.

Let $(Q_n)$ be a paving of $Q$. We say $(Q_n)$ is an open paving if every rectangle
$Q_n$ is open. Suppose we established (4.2.4) for every open paving. Let $t > 1$.
Then for any paving $(Q_n)$ (open or not), $(tQ_n^\circ)$ is an open paving since
$Q_n \subset tQ_n^\circ$ for $n \ge 1$; under the assumption that (4.2.4) is valid for open
pavings, we would have

$$\|Q\| \le \sum_{n=1}^{\infty} \|tQ_n^\circ\| = t^2 \sum_{n=1}^{\infty} \|Q_n^\circ\| = t^2 \sum_{n=1}^{\infty} \|Q_n\|. \tag{4.2.5}$$

Since $t > 1$ in (4.2.5) is arbitrary, we would then obtain (4.2.4) for the
arbitrary paving $(Q_n)$. Thus, it is enough to establish (4.2.4) when the paving
$(Q_n)$ is open and $Q$ is a compact rectangle.


Step 2 


Assume now $Q$ is a compact rectangle and the paving $(Q_n)$ is open. We
show there is an $N > 0$ such that $Q$ is contained in the finite union
$Q_1 \cup Q_2 \cup \dots \cup Q_N$. We argue by contradiction: Suppose that there is no
such $N$. Then for each $n \ge 1$, we may select⁵ $(x_n, y_n) \in Q \setminus (Q_1 \cup \dots \cup Q_n)$.
By Exercise 2.1.1, $(x_n, y_n)$ subconverges to some $(x,y) \in Q$. Now select
$N \ge 1$ such that $(x,y) \in Q_N$. Since $Q_N$ is open, $(x_n, y_n) \in Q_N$ for infinitely many $n$,
contradicting the construction of the sequence $(x_n, y_n)$.


5 The axiom of countable choice may be avoided as in the proof of Theorem 2.1.3.


Step 3


We are reduced to establishing

$$\|Q\| \le \|Q_1\| + \cdots + \|Q_N\| \tag{4.2.6}$$

whenever $Q \subset Q_1 \cup \dots \cup Q_N$ and $Q$ is a compact rectangle and $Q_1, \dots, Q_N$
are open.

Since $\operatorname{area}(Q^\circ) = \operatorname{area}(\bar{Q})$, a moment's thought shows that we may
assume both $Q$ and $Q_1, \dots, Q_N$ are compact rectangles.

Moreover, since $\|Q \cap Q_n\| \le \|Q_n\|$ and $(Q \cap Q_n)$ is a paving of $Q$, by
replacing $(Q_n)$ by $(Q \cap Q_n)$, we may additionally assume $Q = Q_1 \cup \dots \cup Q_N$.

This now is a combinatorial, or counting, argument. Write $Q = I \times J$ and
$Q_n = I_n \times J_n$, $n = 1, \dots, N$. Let $c_0 < c_1 < \cdots < c_r$ denote the distinct
left and right endpoints of $I_1, \dots, I_N$, arranged in increasing order, and set
$I'_i = [c_{i-1}, c_i]$, $i = 1, \dots, r$. Let $d_0 < d_1 < \cdots < d_s$ denote the distinct
left and right endpoints of $J_1, \dots, J_N$, arranged in increasing order, and set
$J'_j = [d_{j-1}, d_j]$, $j = 1, \dots, s$. Let $Q'_{ij} = I'_i \times J'_j$, $i = 1, \dots, r$, $j = 1, \dots, s$.
Then:


A. The rectangles $Q'_{ij}$ intersect at most along their edges,
B. The union of all the $Q'_{ij}$, $i = 1, \dots, r$, $j = 1, \dots, s$, equals $Q$,
C. The union of all the $Q'_{ij}$ contained in a fixed $Q_n$ equals $Q_n$.


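Let $c_{ijn}$ equal 1 or 0 according to whether $Q'_{ij} \subset Q_n$ or not. Then

$$\|Q\| = \sum_{i,j} \|Q'_{ij}\| \tag{4.2.7}$$

and

$$\|Q_n\| = \sum_{i,j} c_{ijn} \|Q'_{ij}\|, \qquad 1 \le n \le N, \tag{4.2.8}$$

since both sums are telescoping. Combining (4.2.8) and (4.2.7) and
interchanging the order of summation, we get

$$\|Q\| = \sum_{i,j} \|Q'_{ij}\| \le \sum_{i,j} \sum_{n} c_{ijn} \|Q'_{ij}\| = \sum_{n} \sum_{i,j} c_{ijn} \|Q'_{ij}\| = \sum_{n} \|Q_n\|.$$

The inequality holds because each $Q'_{ij}$ lies in at least one $Q_n$. This establishes (4.2.6).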
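The counting argument of Step 3 can be traced on a small example. The following is a minimal sketch under stated assumptions, not the author's construction verbatim: rectangles are represented as pairs of closed intervals with rational endpoints, and the helper names `grid_cells`, `contains`, and `check_finite_paving` are hypothetical. It builds the grid cells $I'_i \times J'_j$ from the endpoints and checks (4.2.7), (4.2.8), and the fact that every cell lies in some $Q_n$.

```python
from fractions import Fraction as F
from itertools import product

def naive_area(rect):
    (x0, x1), (y0, y1) = rect
    return (x1 - x0) * (y1 - y0)

def grid_cells(rects):
    """Cut along every edge of the given rectangles; return the cells I'_i x J'_j."""
    xs = sorted({x for I, _ in rects for x in I})
    ys = sorted({y for _, J in rects for y in J})
    return [((xs[i], xs[i + 1]), (ys[j], ys[j + 1]))
            for i, j in product(range(len(xs) - 1), range(len(ys) - 1))]

def contains(big, small):
    (X0, X1), (Y0, Y1) = big
    (x0, x1), (y0, y1) = small
    return X0 <= x0 and x1 <= X1 and Y0 <= y0 and y1 <= Y1

def check_finite_paving(Q, pieces):
    """pieces: compact rectangles whose union is Q. Returns (||Q||, sum of ||Q_n||)."""
    cells = grid_cells(pieces)
    # (4.2.7): the cells tile Q.
    assert sum(naive_area(c) for c in cells) == naive_area(Q)
    # (4.2.8): the cells inside a fixed Q_n tile Q_n.
    for Qn in pieces:
        assert sum(naive_area(c) for c in cells if contains(Qn, c)) == naive_area(Qn)
    # Every cell lies in at least one Q_n, which gives the inequality in (4.2.6).
    assert all(any(contains(Qn, c) for Qn in pieces) for c in cells)
    return naive_area(Q), sum(naive_area(Qn) for Qn in pieces)

Q = ((F(0), F(2)), (F(0), F(1)))
pieces = [((F(0), F(3, 2)), (F(0), F(1))),   # overlapping pieces covering Q
          ((F(1), F(2)), (F(0), F(1))),
          ((F(0), F(2)), (F(0), F(1, 2)))]
print(check_finite_paving(Q, pieces))
```

Here the overlaps are what make the right side of (4.2.6) strictly larger than $\|Q\|$.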


Step 4 


Thus, we have established (4.2.4) for finite pavings, hence by Step 2, for all
pavings. Taking the inf over all pavings $(Q_n)$ of $Q$ in (4.2.4), we obtain $\|Q\| \le
\operatorname{area}(Q)$; hence, $\operatorname{area}(Q) = \|Q\|$ for $Q$ a compact rectangle. As mentioned
in Step 1, this implies the result for every bounded rectangle $Q$. When $Q$
is unbounded, $Q$ contains bounded subrectangles of arbitrarily large area;
hence, the result follows in this case as well.

The reader may be taken aback by the complications involved in estab- 
lishing this intuitively obvious result. Why is the derivation so complicated? 
The answer is that this complication is the price we have to pay if we are 
to stick to our definition (4.2.1) of area. The fact that we will obtain many 
powerful results—in a straightforward fashion—easily offsets the seemingly 
excessive complexity of the above result. In part, we are able to derive the 
powerful results in the rest of this Chapter and in Chapters 5 and 6, because 
of our decision to define area as in (4.2.1). The utility of such a choice can 
only be assessed in terms of the ease with which we obtain our results in 
what follows. 

We now return to the main development. 

We can compute the area of a triangle $A$ with a horizontal base of length
$b$ and height $h$ by constructing a cover of $A$ consisting of thin horizontal
strips (Figure 4.10). Let $\|A\|$ denote the naive area of $A$, i.e., $\|A\| = hb/2$.
By reflection invariance, we may assume $A$ lies above its base. Let us first
assume the two base angles are non-obtuse, i.e., are at most $\pi/2$. Since every
triangle may be rotated into one whose base angles are such, this restriction
may be removed after we establish rotation invariance below.


Fig. 4.10 Cover of a triangle 


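Divide $A$ into $n$ horizontal strips of height $h/n$ as in Figure 4.10. Then the
length of the base of each strip is $b/n$ shorter than the length of the base of
the strip below it, so (Exercise 1.3.12)

$$\operatorname{area}(A) \le \frac{h}{n} \cdot \frac{b}{n} \left( 1 + 2 + \cdots + n \right) = \frac{hb}{n^2} \cdot \frac{n(n+1)}{2} = \frac{hb}{2} \left( 1 + \frac{1}{n} \right).$$

Now let $n \nearrow \infty$ to obtain $\operatorname{area}(A) \le hb/2 = \|A\|$.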
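The strip estimate can be confirmed by direct computation. The following is a minimal sketch, with hypothetical function names, that sums the naive areas of $n$ covering rectangles of height $h/n$ and bases $b, b - b/n, \dots, b/n$, and compares the result with the closed form $\frac{hb}{2}(1 + \frac{1}{n})$.

```python
from fractions import Fraction

def strip_cover_area(b, h, n):
    """Sum of naive areas of n rectangles of height h/n with bases b, b - b/n, ..., b/n."""
    b, h = Fraction(b), Fraction(h)
    return sum((h / n) * (b - k * b / n) for k in range(n))

def closed_form(b, h, n):
    """The value (hb/2)(1 + 1/n) obtained from 1 + 2 + ... + n = n(n+1)/2."""
    return Fraction(b) * Fraction(h) / 2 * (1 + Fraction(1, n))

for n in (1, 10, 1000):
    print(n, strip_cover_area(3, 4, n))   # decreases towards hb/2 = 6
```

The overestimate is exactly $hb/(2n)$, which is the total naive area of the small triangles by which the strips overshoot.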

To obtain the reverse inequality, draw two other triangles $B$, $C$ with
horizontal bases, such that $A \cup B \cup C$ is a rectangle and $A$, $B$, and $C$ intersect
only along their edges. Then by simple arithmetic, the sum of the naive areas
of $A$, $B$, and $C$ equals the naive area of $A \cup B \cup C$, so by subadditivity of
area,

$$\begin{aligned}
\|A\| + \|B\| + \|C\| &= \|A \cup B \cup C\| \\
&= \operatorname{area}(A \cup B \cup C) \\
&\le \operatorname{area}(A) + \operatorname{area}(B) + \operatorname{area}(C) \\
&\le \operatorname{area}(A) + \|B\| + \|C\|.
\end{aligned}$$

Cancelling $\|B\|$, $\|C\|$, we obtain the reverse inequality $\|A\| \le \operatorname{area}(A)$.


Theorem 4.2.2. The area of a triangle equals half the product of the lengths 
of its base and its height. 


We have derived this theorem assuming that the base of the triangle is 
horizontal. The general case follows from rotation invariance, which we do 
below. 

Our next item is additivity. In general, we do not expect area(AU B) = 
area (A) + area(B) because A and B may overlap, i.e., intersect. If A and B 
are disjoint, one expects to have additivity. To what extent this is true leads 
to measurable sets (§4.5). Here we establish additivity only for the case when 
A and B are well separated. Exercises 4.5.12 and 4.5.13 discuss a broader 
case. 

If $A \subset \mathbf{R}^2$ and $B \subset \mathbf{R}^2$, set

$$d(A, B) = \inf \sqrt{(a - c)^2 + (b - d)^2},$$

where the inf is over all points $(a,b) \in A$ and points $(c,d) \in B$. We say $A$
and $B$ are well separated if $d(A, B)$ is positive (Figure 4.11). For example,
although $\{(2,0)\}$ and the unit disk are well separated, $\mathbf{Q} \times \mathbf{Q}$ and $\{(\sqrt{2},0)\}$
are disjoint but not well separated. Note that, since $\inf \emptyset = \infty$, $A$ empty
implies $d(A, B) = \infty$. Hence, the empty set is well separated from any subset
of $\mathbf{R}^2$.


Fig. 4.11 Well-separated sets 


If the lengths of the sides of a rectangle $Q$ are $a$ and $b$, by the diameter of
$Q$, we mean the length of the diagonal $\sqrt{a^2 + b^2}$.

Theorem 4.2.3 (Well-Separated Additivity). If $A$ and $B$ are well
separated, then⁶

$$\operatorname{area}(A \cup B) = \operatorname{area}(A) + \operatorname{area}(B).$$


By subadditivity, $\operatorname{area}(A \cup B) \le \operatorname{area}(A) + \operatorname{area}(B)$, so we need show only
that

$$\operatorname{area}(A \cup B) \ge \operatorname{area}(A) + \operatorname{area}(B). \tag{4.2.9}$$

If $\operatorname{area}(A \cup B) = \infty$, (4.2.9) is true, so assume $\operatorname{area}(A \cup B) < \infty$. In this
case, to compute the area of $A \cup B$, we need consider only pavings involving
bounded rectangles, since the sum of the areas of rectangles with at least
one unbounded rectangle is $\infty$. Let $\epsilon = d(A, B) > 0$. If $(Q_n)$ is a paving of
$A \cup B$ with bounded rectangles $Q_n$, $n \ge 1$, divide each $Q_n$ into subrectangles
all with diameter less than $\epsilon$. Since $\|Q_n\|$ equals the sum of the areas of its
subrectangles, by replacing each $Q_n$ by its subrectangles, we obtain a paving
$(Q'_n)$ of $A \cup B$ by rectangles $Q'_n$ of diameter less than $\epsilon$ and

$$\sum_{n=1}^{\infty} \|Q'_n\| = \sum_{n=1}^{\infty} \|Q_n\|.$$


Thus, for each $n \ge 1$, $Q'_n$ intersects $A$ or $B$ or neither but not both. Let $(Q^A_n)$
denote those rectangles in $(Q'_n)$ intersecting $A$, and let $(Q^B_n)$ denote those
rectangles in $(Q'_n)$ intersecting $B$. Because no $Q'_n$ intersects both $A$ and $B$,
$(Q^A_n)$ is a paving of $A$ and $(Q^B_n)$ is a paving of $B$. Hence, by subadditivity,

$$\operatorname{area}(A) + \operatorname{area}(B) \le \sum_{n=1}^{\infty} \|Q^A_n\| + \sum_{n=1}^{\infty} \|Q^B_n\|
\le \sum_{n=1}^{\infty} \|Q'_n\| = \sum_{n=1}^{\infty} \|Q_n\|.$$

Taking the inf over all pavings $(Q_n)$ of $A \cup B$, we obtain the result.

As an application, let $A$ denote the unit square $[0,1] \times [0,1]$ and $B$ the
triangle obtained by joining the three points $(1,0)$, $(1,1)$, and $(2,1)$. We
already know that $\operatorname{area}(A) = 1$ and $\operatorname{area}(B) = 1/2$, and we want to conclude
that $\operatorname{area}(A \cup B) = 1 + 1/2 = 3/2$ (Figure 4.12). But $A$ and $B$ are not well
separated, so we do not have additivity directly. Instead, we dilate $A$ by a
factor $0 < \alpha < 1$ towards its center. Then the shrunken set $\alpha A$ and $B$ are
well separated. Moreover, $\alpha A \subset A$, so

$$\begin{aligned}
\operatorname{area}(A \cup B) \ge \operatorname{area}((\alpha A) \cup B) &= \operatorname{area}(\alpha A) + \operatorname{area}(B) \\
&= \alpha^2 \cdot \operatorname{area}(A) + \operatorname{area}(B) = \alpha^2 + 1/2.
\end{aligned}$$

Since $\alpha$ can be arbitrarily close to 1, we conclude that $\operatorname{area}(A \cup B) \ge 3/2$.
Since subadditivity yields $\operatorname{area}(A \cup B) \le \operatorname{area}(A) + \operatorname{area}(B) = 1 + 1/2 = 3/2$,
we obtain the result we seek, $\operatorname{area}(A \cup B) = 3/2$.


6 In Caratheodory's terminology [6], this says area is a metric outer measure.


Fig. 4.12 Area of $A \cup B$


More generally, let P be a parallelogram with horizontal or vertical base 
and let ||P|| denote its naive area, i.e., the product of the length of its base 
and its height. Then we leave it as an exercise to show that area (P) = ||P||. 


Theorem 4.2.4. The area of a parallelogram equals the product of the lengths 
of its base and height. 


We have derived this result assuming the base of the parallelogram is 
horizontal or vertical. The general case follows after we establish below the 
rotation invariance of area. 

Now we turn to the rotation invariance of area or, more generally, the 
affine invariance of area. 

A linear map is a map $T : \mathbf{R}^2 \to \mathbf{R}^2$ given by $T(x,y) = (ax + by, cx + dy)$.
Examples of linear maps are


Centered dilations $D(x,y) = (kx, ky)$,

Horizontal dilations $H(x,y) = (kx, y)$,

Vertical dilations $V(x,y) = (x, ky)$,

Flips $F(x,y) = (y, x)$,

Horizontal shears $S(x,y) = (x + ty, y)$,

Vertical shears $S(x,y) = (x, y + tx)$,

Rotations $R(x,y) = (x \cos\theta - y \sin\theta, x \sin\theta + y \cos\theta)$,

Upper-triangular maps $U(x,y) = (ax + by, dy)$, and

Lower-triangular maps $L(x,y) = (ax, cx + dy)$.


The determinant of a linear map $T(x,y) = (ax + by, cx + dy)$ is the real
$\det(T) = ad - bc$. For example, we have $\det(D) = k^2$, $\det(H) = \det(V) = k$,
$\det(F) = -1$, $\det(S) = 1$, and $\det(R) = 1$. A linear map $T$ is affine if
$\det(T) = \pm 1$.


Theorem 4.2.5 (Affine Invariance of Area). Let $A \subset \mathbf{R}^2$. If $T$ is a linear
map, then

$$\operatorname{area}(T(A)) = |\det(T)| \operatorname{area}(A).$$

In particular, note that this establishes the rotation invariance of area
(Figure 4.13).
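As a numerical illustration of the statement (not a proof), one can apply a linear map to the unit square and measure the resulting parallelogram. The following minimal sketch assumes the elementary shoelace formula for the area of a polygon, which is not part of the text's development; the names `linear_map`, `shoelace`, and `image_area` are hypothetical.

```python
import math

def linear_map(a, b, c, d):
    """T(x, y) = (a x + b y, c x + d y)."""
    return lambda x, y: (a * x + b * y, c * x + d * y)

def det(a, b, c, d):
    return a * d - b * c

def shoelace(points):
    """Absolute area of the polygon with the given vertices listed in order."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2

UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]

def image_area(coeffs):
    """Area of T(unit square), computed from the images of the four corners."""
    T = linear_map(*coeffs)
    return shoelace([T(x, y) for x, y in UNIT_SQUARE])

theta = 0.7
examples = {
    "shear":    (1, 2.5, 0, 1),
    "flip":     (0, 1, 1, 0),
    "rotation": (math.cos(theta), -math.sin(theta), math.sin(theta), math.cos(theta)),
    "general":  (2, 1, 1, 3),
}
for name, coeffs in examples.items():
    print(name, image_area(coeffs), abs(det(*coeffs)))
```

In each case the two printed numbers agree, as the theorem predicts for $A$ the unit square.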


Fig. 4.13 Affine invariance of area 


This result has already been established when $T$ is a flip or a dilation. To
establish the result when $T = S$ is a shear, note that $S(Q)$ is a parallelogram
$P$ with a horizontal or vertical base, when $Q$ is a rectangle. Since we know
$\operatorname{area}(P) = \|P\|$, it follows that $\operatorname{area}(T(A)) = \operatorname{area}(A)$ when $T$ is a shear and
$A$ is a rectangle. The case of a general set $A$ is now derived as before by the
use of pavings.

If $(Q_n)$ is a paving of $A$, then $(S(Q_n))$ is a cover of $S(A)$; hence,

$$\operatorname{area}(S(A)) \le \sum_{n=1}^{\infty} \operatorname{area}(S(Q_n)) = \sum_{n=1}^{\infty} \|Q_n\|.$$

Taking the inf over all pavings of $A$, we obtain $\operatorname{area}(S(A)) \le \operatorname{area}(A)$.
Replacing $S$ by its inverse $T(x,y) = (x - ty, y)$ (or $T(x,y) = (x, y - tx)$ respectively),
and $A$ by $S(A)$, yields $\operatorname{area}(A) \le \operatorname{area}(S(A))$. Thus, $\operatorname{area}(S(A)) = \operatorname{area}(A)$.

Thus, the result is established for flips, shears, and dilations. The
derivation of the general result now depends on the following basic facts⁷:


A. The composition (§1.1) $T_1 \circ T_2$ of linear maps $T_1$, $T_2$ is a linear map.
B. If Theorem 4.2.5 holds for linear maps $T_1$ and $T_2$, then it holds for their
composition $T_1 \circ T_2$.


For example, since upper-triangular maps and lower-triangular maps are
compositions of shears, horizontal dilations, and vertical dilations (Exercise
4.2.9), the result holds for upper-triangular and lower-triangular maps.
Moreover, since every linear map $T$ is either of the form $L \circ U$ or of the form $F \circ U$
(Exercise 4.2.9), Theorem 4.2.5 follows.

Because the derivations of A and B above play no role in the subsequent 
development, we relegate them to the exercises. 

7 These group properties are basic to much of mathematics.


By induction, well-separated additivity of area holds for several sets. If
$A_1, A_2, \dots, A_n$ are pairwise well-separated subsets of $\mathbf{R}^2$, then

$$\operatorname{area}(A_1 \cup \dots \cup A_n) = \operatorname{area}(A_1) + \cdots + \operatorname{area}(A_n). \tag{4.2.10}$$

To see this, (4.2.10) is trivially true for $n = 1$, so assume (4.2.10) is true
for a particular $n \ge 1$, and let $A_1, \dots, A_{n+1}$ be pairwise well separated. Let
$\epsilon_j = d(A_j, A_{n+1}) > 0$, $j = 1, \dots, n$. Since

$$d(A_1 \cup \dots \cup A_n, A_{n+1}) = \min(\epsilon_1, \dots, \epsilon_n) > 0,$$

$A_{n+1}$ and $A_1 \cup \dots \cup A_n$ are well separated. Hence, by Theorem 4.2.3 and the inductive hypothesis,

$$\begin{aligned}
\operatorname{area}(A_1 \cup \dots \cup A_{n+1}) &= \operatorname{area}(A_1 \cup \dots \cup A_n) + \operatorname{area}(A_{n+1}) \\
&= \operatorname{area}(A_1) + \cdots + \operatorname{area}(A_n) + \operatorname{area}(A_{n+1}).
\end{aligned}$$

By induction, this establishes (4.2.10) for all $n \ge 1$.
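More generally, if $(A_n)$ is a sequence of pairwise well-separated sets, then

$$\operatorname{area}\left( \bigcup_{n=1}^{\infty} A_n \right) = \sum_{n=1}^{\infty} \operatorname{area}(A_n). \tag{4.2.11}$$

To see this, subadditivity yields

$$\operatorname{area}\left( \bigcup_{n=1}^{\infty} A_n \right) \le \sum_{n=1}^{\infty} \operatorname{area}(A_n).$$

For the reverse inequality, apply (4.2.10) and monotonicity to the first $N$
sets, yielding

$$\operatorname{area}\left( \bigcup_{n=1}^{\infty} A_n \right) \ge \operatorname{area}\left( \bigcup_{n=1}^{N} A_n \right) = \sum_{n=1}^{N} \operatorname{area}(A_n).$$

Now let $N \nearrow \infty$, obtaining

$$\operatorname{area}\left( \bigcup_{n=1}^{\infty} A_n \right) \ge \sum_{n=1}^{\infty} \operatorname{area}(A_n).$$

This establishes (4.2.11).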

As an application of (4.2.11), we can now compute the area of the Cantor
set $C$. The Cantor set is constructed by removing, at successive stages, smaller
and smaller open subsquares of $C_0 = [0,1] \times [0,1]$. Denote these subsquares
$Q_1, Q_2, \dots$ (at what stage they are removed is not important). Then for each
$n$, $C$ and $Q_n$ are disjoint, so for $0 < \alpha < 1$, $C$ and the centered dilations $\alpha Q_n$
are well separated. Moreover, for each $m \ne n$, the centered dilations $\alpha Q_n$ and
$\alpha Q_m$ are well separated. But the union of $C$ with all the squares $\alpha Q_n$, $n \ge 1$,
lies in the unit square $C_0$. Hence, by (4.2.11),

$$\operatorname{area}(C) + \sum_{n=1}^{\infty} \operatorname{area}(\alpha Q_n) = \operatorname{area}\left( C \cup \left( \bigcup_{n=1}^{\infty} \alpha Q_n \right) \right) \le \operatorname{area}(C_0) = 1.$$

In the previous section, we obtained $\sum_{n=1}^{\infty} \operatorname{area}(Q_n) = 1$. By dilation
invariance, this implies $\operatorname{area}(C) + \alpha^2 \le 1$. Letting $\alpha \nearrow 1$, we obtain $\operatorname{area}(C) + 1 \le 1$
or $\operatorname{area}(C) = 0$.


Theorem 4.2.6. The area of the Cantor set is zero. 
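The identity $\sum_{n=1}^{\infty} \operatorname{area}(Q_n) = 1$ used above can be illustrated numerically. The following minimal sketch rests on an assumption about the construction in §4.1 that is not restated in this section: at each stage every surviving square is cut into a $3 \times 3$ grid and only its four corner squares are kept, so five subsquares of each are removed. If the construction differs, the constants 4 and 5 below must be changed accordingly. The function name `removed_area` is hypothetical.

```python
from fractions import Fraction

def removed_area(stages):
    """Total area removed after `stages` steps, ASSUMING each surviving square
    is cut into a 3x3 grid and only its 4 corner squares are kept."""
    total = Fraction(0)
    surviving, side = 1, Fraction(1)
    for _ in range(stages):
        side /= 3                              # side length of the new subsquares
        total += surviving * 5 * side ** 2     # 5 subsquares removed per surviving square
        surviving *= 4                         # 4 corner squares survive
    return total

for n in (1, 2, 5, 20):
    print(n, float(removed_area(n)))           # tends to 1
```

Under this assumption the removed area after $n$ stages is $1 - (4/9)^n$, which tends to 1 and leaves no area for $C$.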


Exercises 


4.2.1. Establish reflection invariance and monotonicity of area. 


4.2.2. Show that the area of a bounded line segment is zero and the area of 
any line is zero. 


4.2.3. Let $P$ be a parallelogram with a horizontal or vertical base, and
let $\|P\|$ denote the product of the length of its base and its height. Then
$\operatorname{area}(P) = \|P\|$.


4.2.4. Compute the area of a trapezoid. 


4.2.5. If $A$ and $B$ are rectangles, then $\operatorname{area}(A \cup B) = \operatorname{area}(A) + \operatorname{area}(B) -
\operatorname{area}(A \cap B)$.


4.2.6. For $k \in \mathbf{R}$, define $H : \mathbf{R}^2 \to \mathbf{R}^2$ and $V : \mathbf{R}^2 \to \mathbf{R}^2$ by $H(x,y) =
(kx, y)$ and $V(x,y) = (x, ky)$. Then $\operatorname{area}[V(A)] = |k| \cdot \operatorname{area}(A) = \operatorname{area}[H(A)]$
for every $A \subset \mathbf{R}^2$.


4.2.7. A mapping $T : \mathbf{R}^2 \to \mathbf{R}^2$ is linear if it is of the form $T(x,y) =
(ax + by, cx + dy)$ with $a, b, c, d \in \mathbf{R}$. The determinant of $T$ is the real
$\det(T) = ad - bc$. If $T$ and $T'$ are linear, show that $T \circ T'$ is linear and
$\det(T \circ T') = \det(T) \det(T')$.


4.2.8. Show that if affine invariance of area holds for $T$ and for $T'$, then it
holds for $T \circ T'$.


4.2.9. Show that every upper-triangular map $U$ may be written as a
composition $V \circ S \circ H$. Similarly, every lower-triangular map satisfies $L = H \circ S \circ V$
for some $H$, $S$, $V$. If $T(x,y) = (ax + by, cx + dy)$ satisfies $a \ne 0$, show $T = L \circ U$
for some $L$, $U$. If $a = 0$, show $T = F \circ U$ for some $U$.


4.2.10. Show that $\{(\sqrt{2}, 0)\}$ and the unit disk are well separated, but
$\{(\sqrt{2}, 0)\}$ and $\mathbf{Q} \times \mathbf{Q}$ are not.


4.2.11. Let $D$ be the unit disk, and let $D^+ = \{(x,y) : x^2 + y^2 < 1 \text{ and } y > 0\}$.
Show that $\operatorname{area}(D) = 2 \cdot \operatorname{area}(D^+)$.


4.2.12. Compute the area of the sets C’ and C® described in Exercises 4.1.1
and 4.1.2, using the properties of area.


4.2.13. Let $P_k = (\cos(2\pi k/n), \sin(2\pi k/n))$, $k = 0, 1, \dots, n$. Then $P_0, P_1, \dots,
P_n$ are evenly spaced points on the unit circle with $P_n = P_0$. Let $D_n$ denote
the $n$-sided polygon obtained by joining the points $P_k$. Compute $\operatorname{area}(D_n)$.


4.2.14. Let $A \subset \mathbf{R}^2$. A triangular paving of $A$ is a cover $(T_n)$ of $A$ where
each $T_n$, $n \ge 1$, is a triangle (oriented arbitrarily). With $\operatorname{area}(A)$ as defined
previously, show that

$$\operatorname{area}(A) = \inf\left\{ \sum_{n=1}^{\infty} \|T_n\| : \text{all triangular pavings } (T_n) \text{ of } A \right\}.$$

Here $\|T\|$ denotes the naive area of the triangle $T$, i.e., half the product of
the length of the base times the height.


4.2.15. Let $A \subset \mathbf{R}^2$. If $\operatorname{area}(A) > 0$ and $0 < \alpha < 1$, there is some rectangle
$Q$ such that $\operatorname{area}(Q \cap A) > \alpha \cdot \operatorname{area}(Q)$. (Argue by contradiction, and use
the definition of area.)


4.3 The Integral 


Let $f$ be defined on an open interval $(a,b)$, where, as usual, $a$ may equal $-\infty$
or $b$ may equal $\infty$. We say $f$ is bounded if $|f(x)| \le M$, $a < x < b$, for some
real $M$. If $f$ is nonnegative, i.e., if $f(x) \ge 0$, $a < x < b$, the subgraph of $f$
over $(a,b)$ is the set (Figure 4.14)

$$G = \{(x,y) : a < x < b,\ 0 < y < f(x)\} \subset \mathbf{R}^2.$$


Note that the inequalities in this definition are strict. 


Fig. 4.14 Subgraphs of nonnegative functions 


For nonnegative $f$, we define the integral⁸ of $f$ from $a$ to $b$ to be the area
of its subgraph $G$,

$$\int_a^b f(x)\,dx = \operatorname{area}(G).$$

Then the integral is simply a quantity that is either 0, a positive real, or $\infty$.
The reason for the unusual notation $\int_a^b f(x)\,dx$ for this quantity is explained
below.


8 The usual terminology is Lebesgue integral.

Thus, according to our definition, every nonnegative function has an inte- 
gral, and integrals of nonnegative functions are areas—nothing more, nothing 
less—of certain subsets of R?. 

Since the empty set has zero area, we always have $\int_a^a f(x)\,dx = 0$. For
each $k \ge 0$, the subgraph of $f(x) = k$, $a < x < b$, over $(a,b)$ is an open
rectangle, so

$$\int_a^b k\,dx = k(b - a).$$

Since the area is monotone, so is the integral: If $0 \le f \le g$ on $(a,b)$,

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

In particular, $0 \le f \le M$ on $(a,b)$ implies $0 \le \int_a^b f(x)\,dx \le M(b - a)$.

A nonnegative function $f$ is integrable over $(a,b)$ if $\int_a^b f(x)\,dx < \infty$. For
example, we have just seen that every bounded nonnegative $f$ is integrable over
a bounded interval $(a,b)$. Now we discuss the integral of a signed function,
i.e., a function that takes on positive and negative values.

Given a function $f : (a,b) \to \mathbf{R}$, we set

$$f^+(x) = \max[f(x), 0],$$

and

$$f^-(x) = \max[-f(x), 0].$$

These are (Figure 4.15) the positive part and the negative part of $f$,
respectively. Note that $f^+ - f^- = f$ and $f^+ + f^- = |f|$.
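The two identities relating $f$, $|f|$, and the positive and negative parts are easy to confirm pointwise. The following is a minimal sketch; the helper names `pos` and `neg` are hypothetical and simply mirror the definitions of $f^+$ and $f^-$ above.

```python
import math

def pos(f):
    """Positive part: f^+(x) = max[f(x), 0]."""
    return lambda x: max(f(x), 0.0)

def neg(f):
    """Negative part: f^-(x) = max[-f(x), 0]."""
    return lambda x: max(-f(x), 0.0)

sin_pos, sin_neg = pos(math.sin), neg(math.sin)

for x in (0.5, 2.0, 4.0, 6.0):
    # f^+ - f^- recovers f, and f^+ + f^- recovers |f|
    print(x, sin_pos(x) - sin_neg(x), sin_pos(x) + sin_neg(x))
```

At every point at most one of $f^+(x)$, $f^-(x)$ is nonzero, which is why both identities hold.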


Fig. 4.15 Positive and negative parts of sin x 
We say a signed function $f$ is integrable over $(a,b)$ if

$$\int_a^b |f(x)|\,dx < \infty.$$

In this case, $\int_a^b f^{\pm}(x)\,dx \le \int_a^b |f(x)|\,dx$ are both finite. For integrable $f$, we
define the integral $\int_a^b f(x)\,dx$ of $f$ from $a$ to $b$ by

$$\int_a^b f(x)\,dx = \int_a^b f^+(x)\,dx - \int_a^b f^-(x)\,dx.$$

From this follows

$$\int_a^b [-f(x)]\,dx = -\int_a^b f(x)\,dx$$

for every integrable function $f$, since $g = -f$ implies $g^+ = f^-$ and $g^- = f^+$.

We warn the reader that, although $\int_a^b f(x)\,dx = \int_a^b f^+(x)\,dx - \int_a^b f^-(x)\,dx$
is a definition, we have not verified the identity $\int_a^b |f(x)|\,dx = \int_a^b f^+(x)\,dx +
\int_a^b f^-(x)\,dx$ for general integrable $f$. For more on this, see §6.1.
From the above discussion, we see that every bounded (signed) function
is integrable over a bounded interval. For example, $\sin x$ and $\sin x / x$ are
integrable over $(0,\pi)$. In fact, both functions are integrable over $(0,b)$ for any
finite $b$, and hence, $\int_0^b \sin x\,dx$ and $\int_0^b (\sin x / x)\,dx$ are defined.

It is reasonable to expect that $\sin x$ is not integrable over $(0,\infty)$. Indeed
the subgraph of $|\sin x|$ consists of a union of sets $G_n$, $n \ge 1$, where each $G_n$
denotes the subgraph over $((n-1)\pi, n\pi)$. By translation invariance, the sets
$G_n$, $n \ge 1$, have the same positive area and the sets $G_1, G_3, G_5, \dots$ are well
separated. Hence, we obtain

$$\int_0^{\infty} (\sin x)^+\,dx = \operatorname{area}\left( \bigcup_{n=1}^{\infty} G_{2n-1} \right) = \sum_{n=1}^{\infty} \operatorname{area}(G_{2n-1}) = \infty.$$

By considering, instead, $G_2, G_4, G_6, \dots$, we obtain $\int_0^{\infty} (\sin x)^-\,dx = \infty$. Thus,

$$\int_0^{\infty} (\sin x)^+\,dx - \int_0^{\infty} (\sin x)^-\,dx = \infty - \infty.$$

Hence, $\int_0^{\infty} \sin x\,dx$ cannot be defined as a difference of two areas. The trick
of considering every other set in a sequence to force well-separatedness is
generalized in the proof of Theorem 4.3.3.

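It turns out that $\sin x / x$ is also not integrable over $(0,\infty)$. To see this, let
$G_n$ denote the subgraph of $|\sin x / x|$ over $((n-1)\pi, n\pi)$, $n \ge 1$ (Figure 4.16).
In each $G_n$, we can insert a rectangle $Q_n$ of area $\sqrt{2}/(4n - 1)$ so that the
rectangles are well separated (select $Q_n$ to have base the open interval obtained
by translating $(\pi/4, 3\pi/4)$ by $(n-1)\pi$ and height as large as possible; see
Figure 4.16). By additivity, then we obtain

$$\int_0^{\infty} \left| \frac{\sin x}{x} \right| dx \ge \operatorname{area}\left( \bigcup_{n=1}^{\infty} Q_n \right) = \sum_{n=1}^{\infty} \operatorname{area}(Q_n) = \sum_{n=1}^{\infty} \frac{\sqrt{2}}{4n - 1} = \infty,$$

by comparison with the harmonic series. Thus, $\sin x / x$ is not integrable over
$(0,\infty)$. More explicitly, this reasoning also shows that

$$\int_0^{\infty} \left( \frac{\sin x}{x} \right)^+ dx \ge \operatorname{area}(Q_1) + \operatorname{area}(Q_3) + \operatorname{area}(Q_5) + \cdots = \infty$$

and

$$\int_0^{\infty} \left( \frac{\sin x}{x} \right)^- dx \ge \operatorname{area}(Q_2) + \operatorname{area}(Q_4) + \operatorname{area}(Q_6) + \cdots = \infty.$$

Thus,

$$\int_0^{\infty} \left( \frac{\sin x}{x} \right)^+ dx - \int_0^{\infty} \left( \frac{\sin x}{x} \right)^- dx = \infty - \infty;$$

hence $\int_0^{\infty} (\sin x / x)\,dx$ also cannot be defined as a difference of two areas.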
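The value $\sqrt{2}/(4n-1)$ for the area of the inserted rectangle can be checked numerically. The following minimal sketch assumes that "height as large as possible" means the smallest value of $|\sin x|/x$ over the base, which is attained at the right endpoint since $|\sin x| \ge \sqrt{2}/2$ on the base and $1/x$ is decreasing; the function names are hypothetical.

```python
import math

def base(n):
    """Base of Q_n: the interval (pi/4, 3pi/4) translated by (n-1)pi."""
    shift = (n - 1) * math.pi
    return shift + math.pi / 4, shift + 3 * math.pi / 4

def q_area(n):
    """Naive area of Q_n, with height |sin x|/x evaluated at the right endpoint."""
    left, right = base(n)
    height = abs(math.sin(right)) / right
    return (right - left) * height

def fits_under_graph(n, samples=1000):
    """Spot-check that |sin x|/x stays above the chosen height over the base."""
    left, right = base(n)
    height = abs(math.sin(right)) / right
    xs = (left + (right - left) * i / samples for i in range(samples + 1))
    return all(abs(math.sin(x)) / x >= height - 1e-12 for x in xs)

def partial_sum(N):
    return sum(q_area(n) for n in range(1, N + 1))

for N in (10, 100, 1000, 10000):
    print(N, partial_sum(N))   # grows without bound, like (sqrt(2)/4) log N
```

Each term exceeds $\sqrt{2}/(4n)$, so the partial sums dominate a multiple of the harmonic series.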

To summarize, the integral of an integrable function is the area of the
subgraph of its positive part minus the area of the subgraph of its negative
part. Every property of $\int_a^b f(x)\,dx$ ultimately depends on a corresponding
property of area.

Frequently, one checks integrability of a given $f$ by first applying one or
more of the properties below to the nonnegative function $|f|$. For example,
consider the function $g(x) = 1/x^2$ for $x > 1$, and, for each $n \ge 1$, let $G_n$
denote the compact rectangle $[n, n+1] \times [0, 1/n^2]$. Then $(G_n)$ is a cover of
the subgraph of $g$ over $(1,\infty)$ (Figure 4.17). Hence,

$$\int_1^{\infty} g(x)\,dx \le \sum_{n=1}^{\infty} \operatorname{area}(G_n) = \sum_{n=1}^{\infty} \frac{1}{n^2},$$

which is finite (§1.6). Thus, $g$ is integrable over $(1,\infty)$. Since the signed
function $f(x) = \cos x / x^2$ satisfies $|f(x)| \le g(x)$ for $x > 1$, by monotonicity,
we conclude that

$$\int_1^{\infty} \left| \frac{\cos x}{x^2} \right| dx < \infty. \tag{4.3.1}$$

Hence, $\cos x / x^2$ is integrable over $(1,\infty)$.


Fig. 4.16 The graphs of $\sin x / x$ and $|\sin x| / x$


Fig. 4.17 A cover of the subgraph of $1/x^2$ over $(1,\infty)$


Of course, functions may be unbounded and integrable. For example,
the function $f(x) = 1/\sqrt{x}$ is integrable over $(0,1)$. To see this, let $G_n$
denote the compact rectangle $[1/(n+1)^2, 1/n^2] \times [0, n+1]$. Then $(G_n)$ is a cover
of the subgraph of $f$ over $(0,1)$ (Figure 4.18). Hence,

$$\int_0^1 \frac{1}{\sqrt{x}}\,dx \le \sum_{n=1}^{\infty} \left( \frac{1}{n^2} - \frac{1}{(n+1)^2} \right)(n+1) = \sum_{n=1}^{\infty} \frac{2n+1}{n^2(n+1)} \le 2 \sum_{n=1}^{\infty} \frac{1}{n^2},$$

which is finite. Thus, $f$ is integrable over $(0,1)$.
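The bound for $1/\sqrt{x}$ can be verified term by term. The following minimal sketch computes the naive area of each covering rectangle $[1/(n+1)^2, 1/n^2] \times [0, n+1]$ exactly and compares it with $2/n^2$; the function names are hypothetical.

```python
from fractions import Fraction

def cover_term(n):
    """Naive area of G_n = [1/(n+1)^2, 1/n^2] x [0, n+1]."""
    width = Fraction(1, n ** 2) - Fraction(1, (n + 1) ** 2)
    return width * (n + 1)

def cover_sum(N):
    """Total naive area of the first N rectangles of the cover."""
    return sum(cover_term(n) for n in range(1, N + 1))

for N in (1, 10, 100, 1000):
    print(N, float(cover_sum(N)))   # increases but stays bounded
```

Each term equals $\frac{1}{n^2} + \frac{1}{n(n+1)}$, so the full sum is $\pi^2/6 + 1$, a finite overestimate of the area of the subgraph.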


Theorem 4.3.1 (Monotonicity). Suppose that f and g are both nonnega- 
tive or both integrable on (a,b). If f <g on (a,b), then 


[sears f atoae 


If 0 < f < g, we already know this. For the integrable case, note that 
f < g implies f* = max(f,0) < max(g,0) = gt and g~ = max(-—g,0) < 
max(—f,0) = f~ on (a,b). Hence, 
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01/16 1/9 1/4 1 


Fig. 4.18 A cover of the subgraph of 1/,/z over (0, 1) 


[ ft(z) dz < [vw dx, 


[rae free, 


Subtracting the second inequality from the first, the result follows. 
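Since $\pm f \le |f|$, the theorem implies

$$\pm\int_a^b f(x)\,dx = \int_a^b \pm f(x)\,dx \le \int_a^b |f(x)|\,dx,$$

which yields

$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$$

for every integrable $f$.


Theorem 4.3.2 (Translation and Dilation Invariance). Let $f$ be nonnegative
or integrable on $(a,b)$. Choose $c \in \mathbf{R}$ and $k > 0$. Then

$$\int_a^b f(x+c)\,dx = \int_{a+c}^{b+c} f(x)\,dx,$$

$$\int_a^b k f(x)\,dx = k\int_a^b f(x)\,dx,$$

and

$$\int_a^b f(kx)\,dx = \frac{1}{k}\int_{ka}^{kb} f(x)\,dx.$$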


If $f$ is nonnegative, let $G$ denote the subgraph of $f(x+c)$ over $(a,b)$
(Figure 4.19). Then the translate $G + (c,0)$ equals

$$\{(x,y) : a + c < x < b + c,\ 0 < y < f(x)\},$$

which is the subgraph of $f(x)$ over the interval $(a+c, b+c)$. By translation
invariance of area, we obtain translation invariance of the integral in the
nonnegative case. If $f$ is integrable, by the nonnegative case,


$$\int_a^b f^+(x+c)\,dx = \int_{a+c}^{b+c} f^+(x)\,dx,$$

and

$$\int_a^b f^-(x+c)\,dx = \int_{a+c}^{b+c} f^-(x)\,dx.$$

Now if $g(x) = f(x+c)$, then $g^+(x) = f^+(x+c)$ and $g^-(x) = f^-(x+c)$. So
subtracting the last equation from the previous one, we obtain translation
invariance in the integrable case.

For the second equation and $f$ nonnegative, recall from the previous
section the dilation mapping $V(x,y) = (x, ky)$, and let $G$ denote the subgraph
of $f$ over $(a,b)$. Then $V(G) = \{(x,y) : a < x < b,\ 0 < y < kf(x)\}$.
Hence, $\operatorname{area}(V(G)) = \int_a^b kf(x)\,dx$. Now dilation invariance of the area
(Exercise 4.2.6) yields $\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx$ for $f$ nonnegative. For
integrable $f$, the result follows by applying, as above, the nonnegative case
to $f^+$ and $f^-$.

For the third equation, let $H(x,y) = (kx, y)$, and let $G$ denote the subgraph
of $f(kx)$ over $(a,b)$. Then $H(G) = \{(x,y) : ka < x < kb,\ 0 < y < f(x)\}$.
The third equation now follows, as before, by dilation invariance. For
integrable $f$, the result follows by applying the nonnegative case to $f^\pm$.




Fig. 4.19 Translation and dilation invariance of integrals 


By similar reasoning, one can also derive (Figure 4.20)

$$\int_{-b}^{-a} f(-x)\,dx = \int_a^b f(x)\,dx,$$

valid for $f$ nonnegative or integrable over $(a,b)$.
The next property is additivity.


Theorem 4.3.3 (Additivity). Suppose that $f$ is nonnegative or integrable
over $(a,b)$, and choose $a < c < b$. Then

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$


Fig. 4.20 Reflection invariance of integrals 


To see this, first, assume that $f$ is nonnegative. Since the vertical line $x = c$
has zero area, subadditivity yields

$$\int_a^b f(x)\,dx \le \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

So we need only show that

$$\int_a^b f(x)\,dx \ge \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \tag{4.3.2}$$


If $f$ is not integrable, (4.3.2) is immediate since the left side is infinite, so
assume $f$ is nonnegative and integrable. Now, choose any strictly increasing
sequence $a < c_1 < c_2 < \cdots$ converging to $c$. Then for $n \ge 1$, the subgraph of $f$
over $(a, c_n)$ and the subgraph of $f$ over $(c, b)$ are well separated (Figure 4.21).
So by monotonicity and well-separated additivity,

$$\int_a^b f(x)\,dx \ge \int_a^{c_n} f(x)\,dx + \int_c^b f(x)\,dx. \tag{4.3.3}$$


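We wish to send $n \nearrow \infty$ in (4.3.3). To this end, for each $n \ge 1$, let $G_n$ denote

$$\{(x,y) : c_n \le x \le c_{n+1},\ 0 < y < f(x)\}.$$

Since $G_2, G_4, G_6, \ldots$ are pairwise well separated,

$$\operatorname{area}(G_2) + \operatorname{area}(G_4) + \operatorname{area}(G_6) + \cdots \le \int_a^b f(x)\,dx < \infty.$$

Since $G_1, G_3, G_5, \ldots$ are pairwise well separated,

$$\operatorname{area}(G_1) + \operatorname{area}(G_3) + \operatorname{area}(G_5) + \cdots \le \int_a^b f(x)\,dx < \infty.$$

Adding the last two inequalities yields the convergence of $\sum_{n=1}^\infty \operatorname{area}(G_n)$.
Hence, the tail (§1.6) goes to zero:

$$\lim_{n\nearrow\infty} \sum_{k=n}^\infty \operatorname{area}(G_k) = 0.$$

Since the subgraph of $f$ over $(c_n, c)$ is contained in $G_n \cup G_{n+1} \cup G_{n+2} \cup \cdots$,
monotonicity and subadditivity imply

$$0 \le \int_{c_n}^c f(x)\,dx \le \sum_{k=n}^\infty \operatorname{area}(G_k), \qquad n \ge 1.$$

Hence, we obtain

$$\lim_{n\nearrow\infty} \int_{c_n}^c f(x)\,dx = 0. \tag{4.3.4}$$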


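Since by monotonicity and subadditivity, again,

$$\int_a^{c_n} f(x)\,dx \le \int_a^c f(x)\,dx \le \int_a^{c_n} f(x)\,dx + \int_{c_n}^c f(x)\,dx,$$

we conclude that

$$\lim_{n\nearrow\infty} \int_a^{c_n} f(x)\,dx = \int_a^c f(x)\,dx.$$

Now sending $n \nearrow \infty$ in (4.3.3) yields (4.3.2). Hence, the result for $f$ nonnegative.
If $f$ is integrable, apply the nonnegative case to $f^+$ and $f^-$. Then

$$\int_a^b f^+(x)\,dx = \int_a^c f^+(x)\,dx + \int_c^b f^+(x)\,dx$$

and

$$\int_a^b f^-(x)\,dx = \int_a^c f^-(x)\,dx + \int_c^b f^-(x)\,dx.$$

Subtracting the second equation from the first, we obtain the result in the
integrable case.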

The trick in the above proof of considering every other set in a sequence 
to force well-separatedness is generalized in the proof of Theorem 4.5.4. 

The bulk of the derivation above involves establishing (4.3.4). If $f$ is
bounded, say by $M$, then the integral in (4.3.4) is no more than $M(c - c_n)$,
hence, trivially, goes to zero. The delicacy is necessary to handle unbounded
situations.

The main point in the derivation is that, although the subgraph $G$ of $f$
over $(a,c)$ and the subgraph $G'$ of $f$ over $(c,b)$ are not well separated, we still
have additivity, because we know something (the existence of the vertical
edges) about the geometry of $G$ and $G'$.

In the previous section, when we wanted to apply additivity to several
sets (e.g., when we computed the area of the Cantor set) that were not well
separated, we dilated them by a factor $0 < \alpha < 1$ and then applied additivity
to the shrunken sets.


Fig. 4.21 Additivity of integrals


Why don’t we use the same trick here for $G$ or $G'$? The reason is that if
the graph of $f$ is sufficiently “jagged” (Figure 4.22), we do not have $\alpha G \subset G$,
a necessary step in applying the shrinking trick of the previous section.


Fig. 4.22 A “jagged” function 


By induction, additivity holds for a partition (§2.2) of $(a,b)$: If
$a = x_0 < x_1 < \cdots < x_n = b$ and $f$ is nonnegative or integrable over $(a,b)$, then

$$\int_a^b f(x)\,dx = \sum_{i=1}^n \int_{x_{i-1}}^{x_i} f(x)\,dx. \tag{4.3.5}$$

Since the right side does not involve the values of $f$ at the points defining the
partition, we conclude that the integrals of two functions $f : (a,b) \to \mathbf{R}$ and
$g : (a,b) \to \mathbf{R}$ are equal whenever they differ only at finitely many points
$a < x_1 < \cdots < x_{n-1} < b$.

Another application of additivity is to piecewise constant functions. A
function $f : (a,b) \to \mathbf{R}$ is piecewise constant if there is a partition
$a = x_0 < x_1 < \cdots < x_n = b$, such that $f$, restricted to each open subinterval
$(x_{i-1}, x_i)$, $i = 1, \ldots, n$, is constant, say equal to $c_i$. (Note that the values of a
piecewise constant function at the partition points $x_i$, $1 \le i \le n-1$, are not
restricted in any way.) In this case, additivity implies

$$\int_a^b f(x)\,dx = \sum_{i=1}^n c_i\,\Delta x_i,$$

where $\Delta x_i = x_i - x_{i-1}$, $i = 1, \ldots, n$. Since a continuous function can be
closely approximated by a piecewise constant function (§2.3), the integral
should be thought of as a sort of sum with $\Delta x_i$ “infinitely small,” hence the
notation $dx$ replacing $\Delta x_i$ and $\int$ replacing $\sum$.

This view is supported by Exercise 4.3.3. Indeed, by defining integrals as 
areas of subgraphs, we capture the intuition that integrals are approximately 
sums of areas of rectangles in any paving and not just finite vertical pavings 
as given by the “Riemann sums” of Exercise 4.3.3. 

Also, since the integral is, by definition, a combination of certain areas and
the notation $\int_a^b f(x)\,dx$ is just a mnemonic device, the variable inside the
integral sign is a “dummy” variable, i.e., $\int_a^b f(x)\,dx = \int_a^b f(t)\,dt$. Nevertheless,
the interpretation of the integral as a “continuous sum” is basic, useful, and
important.

Let us go back to the integrals of $\sin x$ and $\sin x/x$ over $(0,\infty)$. Above, we
saw that these functions were not integrable over $(0,\infty)$, and so $\int_0^\infty \sin x\,dx$
and $\int_0^\infty \sin x/x\,dx$ could not be defined as the difference of the areas of the
positive and the negative parts. An alternate approach is to consider
$F(b) = \int_0^b \sin x\,dx$ and to take the limit $F(\infty) = \lim_{b\to\infty} F(b)$. However, since the
areas of the sets $G_n$, $n \ge 1$, are equal, by additivity,
$F(n\pi) = \operatorname{area}(G_1) - \operatorname{area}(G_2) + \cdots \pm \operatorname{area}(G_n)$ equals $\operatorname{area}(G_1)$ or zero according to
whether $n$ is odd or even. Thus, the limit $F(\infty)$ does not exist, and this approach
fails for $\sin x$.

For $\sin x/x$, however, it is a different story. Let $F(b) = \int_0^b \sin x/x\,dx$, and
let $G_n$ denote the subgraph of $|\sin x|/x$ over $((n-1)\pi, n\pi)$ for each $n \ge 1$.
Then by additivity $F(n\pi) = \operatorname{area}(G_1) - \operatorname{area}(G_2) + \cdots \pm \operatorname{area}(G_n)$. Hence,

$$\lim_{n\nearrow\infty} \int_0^{n\pi} \frac{\sin x}{x}\,dx = \operatorname{area}(G_1) - \operatorname{area}(G_2) + \operatorname{area}(G_3) - \cdots.$$

But this last series has a finite sum since it is alternating with decreasing
terms! Thus,

$$\int_0^\infty \frac{\sin x}{x}\,dx \ne \lim_{n\nearrow\infty} \int_0^{n\pi} \frac{\sin x}{x}\,dx \tag{4.3.6}$$

since the left side is not defined and the right side is a well defined, finite
real.
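The alternating series above lends itself to a quick numerical experiment. The sketch below is an illustration only: the helper `area_G` is a hypothetical name, the areas are estimated by a midpoint rule rather than computed exactly, and the comparison value $\pi/2$ is the classical value of this limit, assumed here and not derived at this point in the text.

```python
import math

def area_G(n: int, steps: int = 2000) -> float:
    """Midpoint-rule estimate of area(G_n), the subgraph of |sin x|/x over ((n-1)pi, n pi)."""
    left = (n - 1) * math.pi
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        x = left + (i + 0.5) * h  # midpoints are positive, so no division by zero
        total += abs(math.sin(x)) / x
    return total * h

areas = [area_G(n) for n in range(1, 201)]

# Partial sums area(G_1) - area(G_2) + ... +/- area(G_n), i.e. F(n pi).
partial_sums = []
s = 0.0
for n, a in enumerate(areas, start=1):
    s += a if n % 2 == 1 else -a
    partial_sums.append(s)

print(areas[:3], partial_sums[-1], math.pi / 2)
```

The estimated areas decrease, and the partial sums oscillate around a limit, as the alternating series argument predicts.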

More generally (Exercise 4.3.14), $F(\infty) = \lim_{b\to\infty} F(b)$ exists, and $F(\infty)$
is computed in Exercise 5.4.12. Even more generally, there is an integral
version (Exercise 4.4.31) of the Dirichlet test for series: If $f$ is smooth and
decreasing to zero as $x \to \infty$, with $f(x)\sin x$ and $f'(x)(1 - \cos x)$ integrable
over $(0,b)$ for all $b > 0$, then

$$\lim_{b\to\infty} \int_0^b f(x)\sin x\,dx$$

exists.

On the other hand, when $f$ is nonnegative or integrable, its integral over
an interval $(a,b)$ can be obtained as a limit of integrals over subintervals
$(a_n, b_n)$ (Figure 4.23), and the behavior (4.3.6) does not occur.


Theorem 4.3.4 (Continuity at the Endpoints). If $f$ is nonnegative or
integrable on $(a,b)$ and $a_n \to a+$, $b_n \to b-$, then

$$\int_a^b f(x)\,dx = \lim_{n\nearrow\infty} \int_{a_n}^{b_n} f(x)\,dx. \tag{4.3.7}$$

If $f$ is integrable on $(a,b)$ and $a_n \to a+$, $b_n \to b-$, then in addition,

$$\lim_{n\nearrow\infty} \int_a^{a_n} f(x)\,dx = 0, \tag{4.3.8}$$

and

$$\lim_{n\nearrow\infty} \int_{b_n}^b f(x)\,dx = 0. \tag{4.3.9}$$


To see this, first assume that $f$ is nonnegative and $b_n \nearrow b$, and fix
$a < c < b$. Since area is monotone, the sequence $\int_c^{b_n} f(x)\,dx$, $n \ge 1$, is
increasing and

$$\lim_{n\nearrow\infty} \int_c^{b_n} f(x)\,dx \le \int_c^b f(x)\,dx.$$

For the reverse inequality, let $G_n$ denote the subgraph of $f$ over $(b_n, b_{n+1})$,
$n \ge 1$. By additivity,

$$\int_c^{b_n} f(x)\,dx = \int_c^{b_1} f(x)\,dx + \sum_{k=1}^{n-1} \operatorname{area}(G_k),$$

so taking the limit and using subadditivity,

$$\lim_{n\nearrow\infty} \int_c^{b_n} f(x)\,dx = \int_c^{b_1} f(x)\,dx + \sum_{k=1}^\infty \operatorname{area}(G_k) \ge \int_c^{b_1} f(x)\,dx + \int_{b_1}^b f(x)\,dx = \int_c^b f(x)\,dx.$$

Hence,

$$\lim_{n\nearrow\infty} \int_c^{b_n} f(x)\,dx = \int_c^b f(x)\,dx. \tag{4.3.10}$$


Fig. 4.23 Continuity at the endpoints

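In general, if $b_n \to b-$, then $b_{n*} \nearrow b$ (§1.5), and $b_{n*} \le b_n < b$. Hence,

$$\int_c^{b_{n*}} f(x)\,dx \le \int_c^{b_n} f(x)\,dx \le \int_c^b f(x)\,dx, \qquad n \ge 1,$$

which implies (4.3.10) for general $b_n \to b-$. Since

$$\int_{a_n}^c f(x)\,dx = \int_{-c}^{-a_n} f(-x)\,dx,$$

applying what we just learned to $f(-x)$ yields

$$\lim_{n\nearrow\infty} \int_{a_n}^c f(x)\,dx = \int_a^c f(x)\,dx.$$

Hence,

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \lim_{n\nearrow\infty} \int_{a_n}^c f(x)\,dx + \lim_{n\nearrow\infty} \int_c^{b_n} f(x)\,dx = \lim_{n\nearrow\infty} \int_{a_n}^{b_n} f(x)\,dx.$$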


For the integrable case, apply (4.3.7) to $f^\pm$ to get (4.3.7) for $f$. Since

$$\int_a^{a_n} f(x)\,dx = \int_a^c f(x)\,dx - \int_{a_n}^c f(x)\,dx,$$

we get (4.3.8). Similarly, we get (4.3.9).
For example,

$$\int_0^1 x^r\,dx = \lim_{a\to 0+} \int_a^1 x^r\,dx,$$

and

$$\int_1^\infty x^r\,dx = \lim_{b\to\infty} \int_1^b x^r\,dx,$$

both for $r$ real.

When $f$ is integrable, the last theorem can be improved: We have continuity
of the integral at every point in $(a,b)$.


Theorem 4.3.5 (Continuity). Suppose that $f$ is integrable over $(a,b)$, and
set

$$F(t) = \int_a^t f(x)\,dx, \qquad a < t < b.$$

Then $F$ is continuous on $(a,b)$.


To see this, fix $a < c < b$, and let $c_n \to c-$. Applying the previous theorem
on $(a,c)$, we obtain $F(c_n) \to F(c)$. Hence, we obtain continuity of $F$ from
the left at every real in $(a,b)$.

Now let $g(x) = f(-x)$, $-b < x < -a$, and

$$G(t) = \int_t^{-a} g(x)\,dx, \qquad -b < t < -a.$$

Since, by additivity,

$$G(t) = \int_{-b}^{-a} g(x)\,dx - \int_{-b}^{t} g(x)\,dx, \qquad -b < t < -a,$$

by the previous paragraph applied to $g$, the function $G$ is continuous from
the left at every point in $(-b,-a)$. Thus, the function

$$G(-t) = \int_{-t}^{-a} f(-x)\,dx = \int_a^t f(x)\,dx = F(t), \qquad a < t < b,$$

is continuous from the right at every point in $(a,b)$. This establishes continuity
of $F$ on $(a,b)$.
Our last item is the integral test for positive series. 


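Theorem 4.3.6 (Integral Test). Let $f : (0,\infty) \to (0,\infty)$ be decreasing.
Then

$$\gamma = \lim_{n\nearrow\infty} \left[\sum_{k=1}^n f(k) - \int_1^{n+1} f(x)\,dx\right] \tag{4.3.11}$$

exists and $0 \le \gamma \le f(1)$. In particular, the integral $\int_1^\infty f(x)\,dx$ is finite iff
the sum $\sum_{n=1}^\infty f(n)$ converges.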


For each $n \ge 1$, let $B_n = (n, n+1) \times (0, f(n))$, $B'_n = (n, n+1) \times [f(n+1), f(n)]$,
and let $G_n$ denote the subgraph of $f$ over $(n, n+1)$ (Figure 4.24).
Since $f$ is decreasing, $G_n \subset B_n \subset G_n \cup B'_n$ for all $n \ge 1$. Then the
quantity whose limit is the right side of (4.3.11) equals

$$\sum_{k=1}^n \left[\operatorname{area}(B_k) - \operatorname{area}(G_k)\right],$$

which is clearly increasing with $n$ (here, we used additivity). Hence, the limit
$\gamma \ge 0$ exists. On the other hand, by subadditivity, we get

$$\operatorname{area}(B_k) - \operatorname{area}(G_k) \le \operatorname{area}(B'_k) = f(k) - f(k+1).$$

So

$$\gamma = \sum_{k=1}^\infty \left[\operatorname{area}(B_k) - \operatorname{area}(G_k)\right] \le \sum_{k=1}^\infty \left[f(k) - f(k+1)\right] \le f(1).$$

Thus, $\gamma \le f(1)$. If either $\int_1^\infty f(x)\,dx$ or $\sum_{n=1}^\infty f(n)$ is finite, (4.3.11) simplifies
to

$$\gamma = \sum_{n=1}^\infty f(n) - \int_1^\infty f(x)\,dx,$$

which shows that the sum is finite iff the integral is finite.


Fig. 4.24 Integral test
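As an illustration of (4.3.11), one can take $f(x) = 1/x$ and watch the bracketed quantity increase toward a limit between 0 and $f(1) = 1$. This is a sketch under an assumption labeled in the code: the value $\int_1^{n+1} dx/x = \log(n+1)$ is taken from the fundamental theorem, which appears only in the next section. The function name `gamma_partial` is hypothetical.

```python
import math

def gamma_partial(n: int) -> float:
    """sum_{k=1}^n f(k) - integral_1^{n+1} f(x) dx for f(x) = 1/x."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    # Assumption: the integral of 1/x over (1, n+1) equals log(n+1).
    return harmonic - math.log(n + 1)

values = [gamma_partial(n) for n in (1, 10, 100, 1000, 10_000, 100_000)]
for v in values:
    print(v)
```

The printed values increase and stay in $[0, 1]$; for this particular $f$ the limit is Euler's constant, which reappears in Exercise 4.4.18.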


For $a < b$ we define

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx.$$

This is useful in 4.3.13 below and elsewhere.


Exercises 


4.3.1. Show that $\int_0^\infty f(kx)x^{-1}\,dx = \int_0^\infty f(x)x^{-1}\,dx$ for $k > 0$ and $f(x)/x$
nonnegative or integrable over $(0,\infty)$.


4.3.2. Show that $\int_{-b}^{-a} f(-x)\,dx = \int_a^b f(x)\,dx$ for $f$ nonnegative or integrable
over $(a,b)$.


4.3.3. Let $f : [a,b] \to \mathbf{R}$ be continuous. If $a = x_0 < x_1 < \cdots < x_n = b$ is a
partition of $[a,b]$, a Riemann sum corresponding to this partition is the real
(Figure 4.25)

$$\sum_{i=1}^n f(x_i^\#)(x_i - x_{i-1}),$$

where $x_i^\#$ is arbitrarily chosen in $(x_{i-1}, x_i)$, $i = 1, \ldots, n$. Let $I = \int_a^b f(x)\,dx$.
Show that, for every $\epsilon > 0$, there is a $\delta > 0$, such that

$$\left|I - \sum_{i=1}^n f(x_i^\#)(x_i - x_{i-1})\right| \le \epsilon \tag{4.3.12}$$

for any partition $a = x_0 < x_1 < \cdots < x_n = b$ of mesh less than $\delta$ and choice
of points $x_1^\#, \ldots, x_n^\#$. (Approximate $f$ by a piecewise constant $f_\epsilon$ as in §2.3.)


4.3.4. Let $f : (0,1) \to \mathbf{R}$ be given by

$$f(x) = \begin{cases} x & \text{if } x \text{ irrational,} \\ 0 & \text{if } x \text{ rational.} \end{cases}$$

Compute $\int_0^1 f(x)\,dx$.


4.3.5. Let $f : (a,b) \to \mathbf{R}$ be nonnegative, and suppose that $g : (a,b) \to \mathbf{R}$ is
nonnegative and piecewise constant. Use additivity to show that

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

(First, do this for $g$ constant.)


Fig. 4.25 Riemann sums 


4.3.6. Let $f : (0,\infty) \to \mathbf{R}$ be nonnegative and equal to a constant $c_n$ on each
subinterval $(n-1, n)$ for $n = 1, 2, \ldots$. Then

$$\int_0^\infty f(x)\,dx = \sum_{n=1}^\infty c_n.$$

Instead, if $f$ is integrable, then $\sum c_n$ is absolutely convergent and the
equality holds.


4.3.7. A function $f : (a,b) \to \mathbf{R}$ is Riemann integrable over $(a,b)$ if there is
a real $I$ satisfying the following property: For all $\epsilon > 0$, there is a $\delta > 0$, such
that (4.3.12) holds for any partition $a = x_0 < x_1 < \cdots < x_n = b$ of mesh less
than $\delta$ and choice of intermediate points $x_1^\#, \ldots, x_n^\#$. Thus, Exercise 4.3.3
says every function continuous on a compact interval $[a,b]$ is Riemann integrable
over $(a,b)$. Let $f(x) = 0$ for $x \in \mathbf{Q}$ and $f(x) = 1$ for $x \notin \mathbf{Q}$. Show that
this $f$ is not Riemann integrable over $(0,1)$.


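4.3.8. Let $g : (0,\infty) \to (0,\infty)$ be decreasing and bounded. Show that

$$\lim_{\delta\to 0+} \delta \sum_{n=1}^\infty g(n\delta) = \int_0^\infty g(x)\,dx.$$

(Apply the integral test to $f(x) = g(x\delta)$.)

4.3.9. Let $f : (-b,b) \to \mathbf{R}$ be nonnegative or integrable. If $f$ is even,
then $\int_{-b}^b f(x)\,dx = 2\int_0^b f(x)\,dx$. Now let $f$ be integrable. If $f$ is odd, then
$\int_{-b}^b f(x)\,dx = 0$.

4.3.10. Show that $\int_{-\infty}^\infty e^{-a|x|}\,dx < \infty$ for $a > 0$.

4.3.11. If $f : \mathbf{R} \to \mathbf{R}$ is superlinear (Exercise 2.3.20) and continuous, the
Laplace transform

$$L(s) = \int_{-\infty}^\infty e^{sx} e^{-f(x)}\,dx$$

is finite for all $s \in \mathbf{R}$. (Write $\int_{-\infty}^\infty = \int_{-\infty}^a + \int_a^b + \int_b^\infty$ for appropriate $a$, $b$.)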


4.3.12. A function $\delta : \mathbf{R} \to \mathbf{R}$ is a Dirac delta function if it is nonnegative
and satisfies

$$\int_{-\infty}^\infty \delta(x) f(x)\,dx = f(0) \tag{4.3.13}$$

for all continuous nonnegative $f : \mathbf{R} \to \mathbf{R}$. Show that there is no such
function. (Construct continuous $f$'s which take on the two values 0 or 1 on
most or all of $\mathbf{R}$, and insert them into (4.3.13).)


4.3.13. If $f$ is convex on $(a,b)$ and $a < c - \delta < c < c + \delta < b$, then
(Exercise 3.3.7)


fle+ 8) - f= #F,(06> | | fi.(a) de. (4.3.14) 


+6) 


Here $\pm$ means there are two cases, either all $+$s or all $-$s. Use this to conclude
that if $f$ is convex on an open interval containing $[a,b]$, then

$$f(b) - f(a) = \int_a^b f'_+(x)\,dx = \int_a^b f'_-(x)\,dx.$$

(Break $[a,b]$ into an evenly spaced partition $a = x_0 < x_1 < \cdots < x_n = b$,
$x_i - x_{i-1} = \delta$, and apply (4.3.14) at each point $c = x_i$.)


4.3.14. With

$$F(b) = \int_0^b \frac{\sin x}{x}\,dx, \qquad b > 0,$$

show that $F(\infty)$ exists. Conclude that $F : (0,\infty) \to \mathbf{R}$ is bounded. This is a
special case of Exercise 4.4.31.


By constructing appropriate covers, Archimedes was able to compute areas
and integrals in certain situations. For example, he knew that $\int_0^1 x^2\,dx = 1/3$.
On the other hand, Archimedes was also able to compute tangent lines to 
certain curves and surfaces. However, he apparently had no idea that these 
two processes were intimately related, through the fundamental theorem of 
calculus. It was the discovery of the fundamental theorem, in the seventeenth 
century, that turned the computation of areas from a mystery to a simple 
and straightforward reality. 

In this section, all functions will be continuous. Since we will use $f^+$ and
$f^-$ repeatedly, it is important to note that (§2.3) a function is continuous iff
both its positive and negative parts are continuous.

Let f be continuous on (a, b), and let [c,d] be a compact subinterval. Since 
(§2.3) continuous functions map compact intervals to compact intervals, f is 
bounded on [c, d], hence integrable over (c, d). 

Let $f$ be continuous on $(a,b)$, fix $a < c < b$, and set

$$F_c(x) = \begin{cases} \displaystyle\int_c^x f(t)\,dt, & c \le x < b, \\[2ex] \displaystyle -\int_x^c f(t)\,dt, & a < x \le c. \end{cases}$$

By the previous paragraph, $F_c(x)$ is finite for all $a < x < b$. From the
previous section, we know that $F_c$ is continuous. Here we show that $F_c$ is
differentiable and $F_c'(x) = f(x)$ on $(a,b)$ (Figure 4.26). We will need the
modulus of continuity $\mu_x$ (§2.3) of $f$ at $x$. To begin, by additivity,
$F_c(y) - F_c(x) = F_x(y) - F_x(x)$ for any two points $x$, $y$ in $(a,b)$, whether they are to
the right or the left of $c$.

Then for $a < x < t < y < b$, $|f(t) - f(x)| \le \mu_x(y - x)$. Thus,
$f(t) \le f(x) + \mu_x(y - x)$. Hence,

$$\frac{F_c(y) - F_c(x)}{y - x} = \frac{F_x(y) - F_x(x)}{y - x} = \frac{1}{y - x}\int_x^y f(t)\,dt \le \frac{1}{y - x}\int_x^y [f(x) + \mu_x(y - x)]\,dt = f(x) + \mu_x(y - x).$$

Similarly, since $a < x < t < y < b$ implies $f(x) - \mu_x(y - x) \le f(t)$,

$$\frac{F_c(y) - F_c(x)}{y - x} \ge f(x) - \mu_x(y - x).$$

Combining the last two inequalities, we obtain

$$\left|\frac{F_c(y) - F_c(x)}{y - x} - f(x)\right| \le \mu_x(y - x)$$

for $a < x < y < b$. If $a < y < x < b$, repeating the same steps yields

$$\left|\frac{F_c(y) - F_c(x)}{y - x} - f(x)\right| \le \mu_x(x - y).$$

Hence, if $a < x \ne y < b$,

$$\left|\frac{F_c(y) - F_c(x)}{y - x} - f(x)\right| \le \mu_x(|y - x|),$$

which implies, by continuity of $f$ at $x$,

$$\lim_{y\to x} \frac{F_c(y) - F_c(x)}{y - x} = f(x).$$

Hence, $F_c'(x) = f(x)$. We have established the following result, first mentioned
in §3.7.


Fig. 4.26 The derivative at x of the integral of f is f(x) 


Theorem 4.4.1. Every continuous $f : (a,b) \to \mathbf{R}$ has a primitive on $(a,b)$.


When f is continuous and integrable on (a,b), we can do better. 


Theorem 4.4.2 (First Fundamental Theorem of Calculus). Let $f$ be
continuous and integrable on $(a,b)$. Then

$$F(x) = \int_a^x f(t)\,dt, \qquad a < x < b,$$

implies

$$F'(x) = f(x), \qquad a < x < b,$$

and

$$F(x) = \int_x^b f(t)\,dt, \qquad a < x < b,$$

implies

$$F'(x) = -f(x), \qquad a < x < b.$$


To see this, for the first implication, write $\int_a^x f(t)\,dt = \int_a^c f(t)\,dt + F_c(x)$,
and use $F_c'(x) = f(x)$. Since, by additivity, $\int_a^x f(t)\,dt + \int_x^b f(t)\,dt$ equals the
constant $\int_a^b f(x)\,dx$, the second implication follows.

For example, if

$$F(\theta) = \int_0^{\tan\theta} e^{-t^2}\,dt, \qquad 0 < \theta < \frac{\pi}{2},$$

then $F'(\theta) = e^{-\tan^2\theta}\sec^2\theta$ by the above theorem combined with the chain
rule. We will need this in §5.4.
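The derivative formula in this example can be sanity-checked with a finite difference. The sketch below is only a numerical illustration: it assumes the standard identity $\int_0^u e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}\operatorname{erf}(u)$ to evaluate $F$ (the error function is not used in the text), and the function names are hypothetical.

```python
import math

def F(theta: float) -> float:
    """F(theta) = integral_0^{tan theta} exp(-t^2) dt, evaluated via the error function."""
    return 0.5 * math.sqrt(math.pi) * math.erf(math.tan(theta))

def F_prime_formula(theta: float) -> float:
    """The claimed derivative exp(-tan^2 theta) * sec^2 theta."""
    return math.exp(-math.tan(theta) ** 2) / math.cos(theta) ** 2

def F_prime_numeric(theta: float, h: float = 1e-5) -> float:
    """Central difference approximation of F'(theta)."""
    return (F(theta + h) - F(theta - h)) / (2 * h)

for theta in (0.2, 0.7, 1.2):
    print(theta, F_prime_numeric(theta), F_prime_formula(theta))
```

At each sample angle in $(0, \pi/2)$ the two columns agree to several decimal places.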

The above result begs an answer to a broader question. Let $f : (a,b) \to \mathbf{R}$
be an arbitrary (not necessarily continuous) integrable function. We know
from §4.3 that $F$ is continuous. We also know $F$ is differentiable and $F' = f$
on $(a,b)$ when $f$ is continuous. Are there any other integrable functions $f$ for
which this is so? It turns out that $F$ is differentiable and $F' = f$ almost
everywhere on $(a,b)$ iff $f$ is measurable.⁹

The last two results show that integrals yield primitives. This is one version 
of the Fundamental Theorem of Calculus. The other version of the fundamen- 
tal theorem states that primitives yield integrals. When one is seeking areas 
or integrals, it is this version that is all important. 


Theorem 4.4.3 (Second Fundamental Theorem of Calculus). Let $f$
be nonnegative or integrable over $(a,b)$. Suppose $f$ is continuous and let $F$ be
any primitive of $f$ on $(a,b)$. Then $F(b-)$ and $F(a+)$ exist, and

$$\int_a^b f(x)\,dx = F(b-) - F(a+).$$


To see this, first, assume that $f$ is nonnegative. Then $F$ is increasing
($F' = f \ge 0$). Hence, $F(b-)$ and $F(a+)$ exist for any primitive $F$. In particular,
with $F_c$ as above, $F_c(b-)$ and $F_c(a+)$ exist. Since $F_c - F = k$ is a constant,
by continuity at the endpoints,

$$\begin{aligned}
\int_a^b f(x)\,dx &= \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \\
&= \lim_{n\nearrow\infty} \int_{a+1/n}^c f(x)\,dx + \lim_{n\nearrow\infty} \int_c^{b-1/n} f(x)\,dx \\
&= -\lim_{n\nearrow\infty} F_c(a + 1/n) + \lim_{n\nearrow\infty} F_c(b - 1/n) \\
&= F_c(b-) - F_c(a+) \\
&= (F(b-) + k) - (F(a+) + k) = F(b-) - F(a+).
\end{aligned}$$


⁹ This generalization is derived in §6.6.


For the integrable case, let $F^\pm$ denote primitives of $f^\pm$ (here, $F^\pm$ are not the
positive and negative parts of $F$). Then $F^+ - F^-$ differs from any primitive
$F$ of $f$ by a constant $k$. Since $F^\pm(b-)$ and $F^\pm(a+)$ exist, so do $F(b-)$ and
$F(a+)$. Hence,

$$\begin{aligned}
\int_a^b f(x)\,dx &= \int_a^b f^+(x)\,dx - \int_a^b f^-(x)\,dx \\
&= F^+(b-) - F^+(a+) - F^-(b-) + F^-(a+) \\
&= F(b-) + k - F(a+) - k = F(b-) - F(a+).
\end{aligned}$$


Note that, in the second fundamental theorem, as stated above, $a$ or $b$ or
$F(a+)$ or $F(b-)$ may be infinite. This result is generalized to noncontinuous
functions in §6.3.

When $a$, $b$, $F(b-)$, and $F(a+)$ are all finite, the second fundamental theorem
simplifies slightly. Indeed, in this case, by defining $F(b) = F(b-)$,
$F(a) = F(a+)$, the primitive $F$ extends to a continuous function on the
compact interval $[a,b]$ and the fundamental theorem becomes

$$\int_a^b f(x)\,dx = F(b) - F(a).$$


In particular, this simpler form of the fundamental theorem applies when f 
and (a,b) are both bounded. All primitives displayed below were obtained 
in §3.7. 

For example, $\sin x$ is bounded and has the primitive $-\cos x$ on $(0,\pi)$. So

$$\int_0^\pi \sin x\,dx = (-\cos\pi) - (-\cos 0) = 2.$$

Similarly, $x^n$, $n \ge 0$, is bounded and has the primitive $x^{n+1}/(n+1)$ over any
bounded interval $(a,b)$, so

$$\int_a^b x^n\,dx = \frac{b^{n+1} - a^{n+1}}{n+1}. \tag{4.4.1}$$

In particular, $x^{2n}$ is nonnegative, hence (4.4.1) also holds when $a$ or $b$ are
infinite. For example,

$$\int_{-\infty}^\infty x^{2n}\,dx = \frac{\infty^{2n+1}}{2n+1} - \frac{(-\infty)^{2n+1}}{2n+1} = \infty$$

is perfectly valid.
Below, it is convenient to denote $F(b) - F(a) = F(x)|_a^b$. Since, in §3.7, a
primitive of $f$ was written $\int f(x)\,dx$, the fundamental theorem becomes

$$\int_a^b f(x)\,dx = \left.\int f(x)\,dx\,\right|_a^b.$$

This explains the notation $\int f(x)\,dx$ for primitives. (The notation $\int_a^b f(x)\,dx$
for integrals was explained in §4.3.)

Also, $f(x) = 1/\sqrt{1 - x^2} > 0$ has the primitive $F(x) = \arcsin x$ continuous
over $[-1,1]$, so

$$\int_{-1}^1 \frac{dx}{\sqrt{1 - x^2}} = \arcsin x\,\Big|_{-1}^1 = \arcsin 1 - \arcsin(-1) = \pi.$$

Similarly, since $f(x) = 1/(1 + x^2)$ is nonnegative and has the primitive
$F(x) = \arctan x$ over $\mathbf{R}$,

$$\int_{-\infty}^\infty \frac{dx}{1 + x^2} = \arctan x\,\Big|_{-\infty}^\infty = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi.$$

The unit disk

$$D = \{(x,y) : x^2 + y^2 < 1\}$$

is the disjoint union of a horizontal line segment and the two half-disks

$$D^\pm = \{(x,y) : x^2 + y^2 < 1,\ \pm y > 0\}.$$

Then $\operatorname{area}(D) = 2\cdot\operatorname{area}(D^+)$ (Exercise 4.2.11). But $D^+$ is the subgraph
of $f(x) = \sqrt{1 - x^2}$ over $(-1,1)$, which has a primitive continuous on $[-1,1]$.
Hence,

$$\int_{-1}^1 \sqrt{1 - x^2}\,dx = \frac{1}{2}\left(\arcsin x + x\sqrt{1 - x^2}\right)\Big|_{-1}^1 = \frac{\pi}{2}.$$

This yields the following.


Theorem 4.4.4. The area of the unit disk is $\pi$.


Of course, by translation and dilation invariance, the area of any disk of
radius $r > 0$ is $\pi r^2$. Another integral is

$$\int_0^1 (-\log x)\,dx = (x - x\log x)\Big|_{0+}^1 = 1 + \lim_{x\to 0+} x\log x = 1 + 0 = 1.$$

Our next item is the linearity of the integral.

Theorem 4.4.5 (Linearity). Suppose that $f$, $g$ are continuous on $(a,b)$.
If $f$ and $g$ are both nonnegative or both integrable over $(a,b)$, then

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$


To see this, let $F$ and $G$ be primitives corresponding to $f$ and $g$. Then
$f + g = F' + G' = (F + G)'$. So $F + G$ is a primitive of $f + g$. By the
fundamental theorem,

$$\int_a^b [f(x) + g(x)]\,dx = F(b-) + G(b-) - F(a+) - G(a+) = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$


More generally, linearity is valid when $f$ and $g$ are both nonnegative and $f$
or $g$ is continuous (Exercise 4.4.24). To what extent is linearity valid when
$f$ or $g$ are signed? Since integrals are defined in terms of areas, the scope
of the validity of the linearity of integrals is intimately connected with the
scope of the validity of additivity of areas


$$\operatorname{area}(A \cup B) = \operatorname{area}(A) + \operatorname{area}(B)$$


for disjoint sets A and B. This leads to measurable sets, discussed in §4.5, 
and measurable functions, discussed in §6.1. 
We say $f : (a,b) \to \mathbf{R}$ is piecewise continuous if there is a partition
$a = x_0 < x_1 < \cdots < x_n = b$, such that $f$ is continuous on each subinterval
$(x_{i-1}, x_i)$, $i = 1, \ldots, n$. Now by additivity, the integral $\int_a^b$ can be broken up
into $\int_{x_{i-1}}^{x_i}$, $i = 1, \ldots, n$. We conclude that linearity also holds for piecewise
continuous functions.

By induction, linearity holds for finitely many (piecewise) continuous
functions. If $f_1, \ldots, f_n$ are (piecewise) continuous and all nonnegative or all
integrable over $(a,b)$, then

$$\int_a^b \sum_{k=1}^n f_k(x)\,dx = \sum_{k=1}^n \int_a^b f_k(x)\,dx.$$


Since primitives are connected to integrals by the fundamental theorem, 
there is an integration by parts (§3.7) result for integrals. 


Theorem 4.4.6 (Integration by Parts). Let $f$ and $g$ be differentiable on
$(a,b)$ with $f'g$ and $fg'$ continuous. If $f'g$ and $fg'$ are both nonnegative or
both integrable, then

$$\int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big|_{a+}^{b-} - \int_a^b f'(x)g(x)\,dx.$$

This follows by applying the fundamental theorem to $f'g + fg' = (fg)'$
and using linearity.

Since primitives are connected to integrals by the fundamental theorem, 
there is a substitution (§3.7) result for integrals. Recall (§2.3) that continuous 
strictly monotone functions map open intervals to open intervals. 


Theorem 4.4.7 (Substitution). Let $g$ be differentiable and strictly monotone
on an interval $(a,b)$ with $g'$ continuous, and let $(m,M) = g[(a,b)]$. Let
$f : (m,M) \to \mathbf{R}$ be continuous. If $f$ is nonnegative or integrable over $(m,M)$,
then $f[g(t)]|g'(t)|$ is nonnegative or integrable over $(a,b)$, and

$$\int_m^M f(x)\,dx = \int_a^b f[g(t)]\,|g'(t)|\,dt. \tag{4.4.2}$$


To see this, first, assume that $g$ is strictly increasing and $f$ is nonnegative;
let $F$ be a primitive of $f$, let $H(t) = F[g(t)]$, and let $h(t) = f[g(t)]g'(t)$. Then
$(m,M) = (g(a+), g(b-))$ and $H'(t) = F'[g(t)]g'(t) = f[g(t)]g'(t) = h(t)$ by
the chain rule. Hence, $H$ is a primitive for $h$. Moreover, $h$ is continuous and
nonnegative, $F(M-) = H(b-)$, and $F(m+) = H(a+)$. By the fundamental
theorem,

$$\int_m^M f(x)\,dx = F(M-) - F(m+) = H(b-) - H(a+) = \int_a^b f[g(t)]\,g'(t)\,dt.$$

Since $|g'(t)| = g'(t)$, this establishes the case with $g$ strictly increasing and
$f$ nonnegative. If $f$ is integrable, apply the nonnegative case to $f^\pm$. Since
the positive and negative parts of $f[g(t)]g'(t)$ are $f^\pm[g(t)]g'(t)$, the integrable
case follows.

If $g$ is strictly decreasing, then $(m,M) = (g(b-), g(a+))$. Now $h(t) = g(-t)$
is strictly increasing, $h((-b,-a)) = (m,M)$, and $h'(-t) = -g'(t) = |g'(t)|$ is
nonnegative on $(a,b)$. Applying what we just learned to $f$ and $h$ over $(-b,-a)$
yields

$$\int_m^M f(x)\,dx = \int_{-b}^{-a} f[h(t)]\,h'(t)\,dt = \int_a^b f[h(-t)]\,h'(-t)\,dt = \int_a^b f[g(t)]\,|g'(t)|\,dt.$$

If $g$ is not monotone, then (4.4.2) has to be reformulated (Exercise 4.4.25).
To see what happens, let us consider a simple example with $f(x) = 1$. Let
$g : (a,b) \to (m,M)$ be piecewise linear with line segments inclined at $\pm\pi/4$.
By this, we mean $g$ is continuous on $(a,b)$ and the graph of $g$ is a line segment
with slope $\pm 1$ on each subinterval $(t_{i-1}, t_i)$, $i = 1, \ldots, n$, for some partition
$a = t_0 < t_1 < \cdots < t_n = b$ of $(a,b)$ (Figure 4.27). Then $|g'(t)| = 1$ for all
but finitely many $t$, so $\int_a^b |g'(t)|\,dt = b - a$. On the other hand, substituting
$f(x) = 1$ in (4.4.2) gives $\int_a^b |g'(t)|\,dt = M - m$. Thus, in such a situation,
(4.4.2) cannot be correct unless the domain and the range have the same
length, i.e., $M - m = b - a$.

To fix this, we have to take into account the extent to which $g$ is not a
bijection. To this end, for each $x$ in $(m,M)$, let $\#(x)$ denote the number of
points in the inverse image $g^{-1}(\{x\})$. Since $(m,M)$ is the range of $g$, $\#(x) \ge 1$
for all $m < x < M$. The correct replacement¹⁰ for (4.4.2) with $f(x) = 1$ is

$$\int_m^M \#(x)\,dx = \int_a^b |g'(t)|\,dt. \tag{4.4.3}$$


This holds as long as $g$ is continuous on $(a,b)$ and there is a partition
$a = t_0 < t_1 < \cdots < t_n = b$ of $(a,b)$ with $g$ differentiable, $g'$ continuous,
and $g$ strictly monotone on each subinterval $(t_{i-1}, t_i)$, for each $i = 1, \ldots, n$
(Exercise 4.4.25). For example, supposing $g : (a,b) \to (m,M)$ piecewise linear
with slopes $\pm 1$ reduces (4.4.3) to $\int_m^M \#(x)\,dx = b - a$. Dividing by $M - m$
yields

$$\frac{1}{M - m}\int_m^M \#(x)\,dx = \frac{b - a}{M - m}. \tag{4.4.4}$$

Now the left side of (4.4.4) may be thought of as the average value of $\#(x)$
over $(m,M)$. We conclude that, for a piecewise linear $g$ with slopes $\pm 1$, the
average value of the number of inverse images equals the ratio of the lengths
of the domain over the range.
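The averaging statement (4.4.4) can be tried on a concrete zigzag. The sketch below is an illustration under stated assumptions: the function `average_preimage_count` and the sample partition $0 < 1 < 1.5 < 3$ with slopes $+1, -1, +1$ are hypothetical choices, $g$ is normalized to start at the value 0, and $\#(x)$ is sampled at midpoints of a uniform grid that avoids the finitely many vertex values.

```python
def average_preimage_count(breaks, slopes, samples=100_000):
    """Return (average of #(x) over (m, M), (b - a)/(M - m)) for a continuous
    piecewise linear g with the given slopes on the subintervals of `breaks`."""
    # Values of g at the partition points, starting (by assumption) from g = 0.
    values = [0.0]
    for i, slope in enumerate(slopes):
        values.append(values[-1] + slope * (breaks[i + 1] - breaks[i]))
    m, M = min(values), max(values)
    total = 0
    for j in range(samples):
        x = m + (j + 0.5) * (M - m) / samples
        # Each strictly monotone piece contributes one preimage iff x lies
        # strictly between the values of g at its two endpoints.
        total += sum(1 for lo, hi in zip(values, values[1:]) if min(lo, hi) < x < max(lo, hi))
    return total / samples, (breaks[-1] - breaks[0]) / (M - m)

average, ratio = average_preimage_count([0.0, 1.0, 1.5, 3.0], [1, -1, 1])
print(average, ratio)
```

Here the domain has length 3 and the range has length 2, and $\#(x)$ equals 3 on $(0.5, 1)$ and 1 elsewhere, so both numbers come out as $3/2$.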


Fig. 4.27 Piecewise linear: #(x) = 4, #(x’) =3 


¹⁰ This is generalized to any continuous $g$ in Theorem 6.6.4.

Now we derive the integral version of Taylor's theorem.


Theorem 4.4.8 (Taylor’s Theorem). Let $n \ge 0$ and suppose that $f$ is
$(n+1)$ times differentiable on $(a,b)$, with $f^{(n+1)}$ continuous on $(a,b)$. Suppose
that $f^{(n+1)}$ is nonnegative or integrable over $(a,b)$, and fix $a < c < x < b$.
Then

$$f(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n + \frac{h_{n+1}(x)}{(n+1)!}(x - c)^{n+1},$$

where

$$h_{n+1}(x) = (n+1)\int_0^1 (1 - s)^n f^{(n+1)}[c + s(x - c)]\,ds.$$

To see this, recall, in §3.5, that we obtained $R_{n+1}(x,x) = 0$ and (here, $'$
denotes derivative with respect to $t$)

$$R'_{n+1}(x,t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n.$$

Now apply the fundamental theorem to $-R'_{n+1}(x,t)$ and substitute
$t = c + s(x - c)$, $dt = (x - c)\,ds$, obtaining

$$\begin{aligned}
R_{n+1}(x,c) &= \frac{1}{n!}\int_c^x f^{(n+1)}(t)(x - t)^n\,dt \\
&= \frac{(x - c)^{n+1}}{n!}\int_0^1 f^{(n+1)}[c + s(x - c)](1 - s)^n\,ds \\
&= \frac{(x - c)^{n+1}}{(n+1)!}\,h_{n+1}(x).
\end{aligned}$$

In contrast with the Lagrange and Cauchy forms (§3.5) of the remainder,
here, we need continuity and nonnegativity or integrability of $f^{(n+1)}$.
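The integral form of the remainder can be tested numerically on a function whose derivatives are all known. The sketch below takes $f = \exp$, an illustrative choice not made in the text, and approximates $h_{n+1}(x)$ by a midpoint rule; the function name and the number of quadrature steps are hypothetical.

```python
import math

def taylor_with_integral_remainder(x: float, c: float, n: int, steps: int = 20_000) -> float:
    """Right side of Taylor's theorem for f = exp, whose derivatives all equal exp."""
    polynomial = sum(math.exp(c) * (x - c) ** k / math.factorial(k) for k in range(n + 1))
    ds = 1.0 / steps
    # h_{n+1}(x) = (n+1) * integral_0^1 (1-s)^n f^{(n+1)}(c + s(x-c)) ds, midpoint rule in s.
    integral = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        integral += (1 - s) ** n * math.exp(c + s * (x - c))
    h = (n + 1) * integral * ds
    return polynomial + h * (x - c) ** (n + 1) / math.factorial(n + 1)

approx = taylor_with_integral_remainder(x=1.0, c=0.0, n=3)
print(approx, math.e)
```

With $c = 0$, $x = 1$, $n = 3$, the cubic Taylor polynomial plus the integral remainder reproduces $e$ up to the quadrature error.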

Our last item is the integration of power series. Since we already know
(§3.7) how to find primitives of power series, the fundamental theorem and
(4.4.1) yield the following.


Theorem 4.4.9. Suppose that $R > 0$ is the radius of convergence of

$$f(x) = \sum_{n=0}^\infty a_n x^n.$$

If $[a,b] \subset (-R,R)$, then

$$\int_a^b f(x)\,dx = \sum_{n=0}^\infty \int_a^b a_n x^n\,dx. \tag{4.4.5}$$

For example, substituting $-x^2$ for $x$ in the exponential series,

$$e^{-x^2} = 1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots.$$

Integrating this over $(0,1)$, we obtain

$$\int_0^1 e^{-x^2}\,dx = 1 - \frac{1}{1!\,3} + \frac{1}{2!\,5} - \frac{1}{3!\,7} + \cdots.$$

This last result is, in general, false if $a = -R$ or $b = R$. For example,
with $f(x) = e^{-x} = \sum_{n=0}^\infty (-1)^n x^n/n!$ and $(a,b) = (0,\infty)$, (4.4.5) reads
$1 = \infty - \infty + \infty - \infty + \cdots$. Under additional assumptions, however, (4.4.5) is
true, even in these cases (see §5.2).
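The termwise integration of $e^{-x^2}$ over $(0,1)$ is easy to check numerically. In the sketch below, the reference value uses the standard identity $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\operatorname{erf}(1)$, which is an outside fact assumed for comparison and not something derived in this section; the function name is hypothetical.

```python
import math

def series_value(terms: int = 20) -> float:
    """Partial sum of sum_n (-1)^n / (n! (2n+1)), the termwise integral of exp(-x^2) over (0,1)."""
    return sum((-1) ** n / (math.factorial(n) * (2 * n + 1)) for n in range(terms))

# Assumed reference value via the error function.
reference = 0.5 * math.sqrt(math.pi) * math.erf(1.0)
print(series_value(), reference)
```

Because of the factorials in the denominators, twenty terms already agree with the reference value to machine precision.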


Exercises 


4.4.1. Compute $\int_0^\infty e^{-sx}\,dx$ for $s > 0$.


4.4.2. Compute $\int_0^1 x^{r-1}\,dx$ and $\int_1^\infty x^{r-1}\,dx$ and $\int_0^\infty x^{r-1}\,dx$ for $r$ real. (There
are three cases, $r < 0$, $r = 0$, and $r > 0$.)


4.4.3. Suppose that $f$ is continuous over $(a,b)$, and let $F$ be any primitive.
If $f$ and $(a,b)$ are both bounded, then $f$ is integrable, and $F(a+)$ and $F(b-)$
are finite.


4.4.4. Let $f$ be continuous on $[1,\infty)$ and differentiable on $(1,\infty)$ with $f'$
continuous and nonnegative over $(1,\infty)$. Show $f(\infty)$ exists iff $f'$ is integrable
over $(1,\infty)$.


4.4.5. Let $f$ be continuous on $[1,\infty)$ and differentiable on $(1,\infty)$ with $f'$
continuous, decreasing, and nonnegative over $(1,\infty)$. Show that


> f'(n) < 00 


n=1 


S> £'(n) 
= F(n) < Ow. 


(Use the integral test.) 


4.4.6. Let $f(x) = \sin x/x$, $x > 0$, and let $F(b) = \int_0^b f(x)\,dx$, $b > 0$. Show that $F(\infty) = \lim_{b\to\infty} F(b)$ exists and is finite. (Write $F(b) = \int_0^1 f(x)\,dx + \int_1^b f(x)\,dx$, integrate the second integral by parts, and use (4.3.1). This limit is computed in Exercise 5.4.12.)


4.4.7. For $f$ continuous and nonnegative or integrable over $(0,1)$,

$$\int_0^1 f(x)\,dx = \int_0^\infty e^{-t} f(e^{-t})\,dt.$$


4.4.8. Compute $\int_0^\infty e^{-sx} x^n\,dx$ for $s > 0$ and $n \ge 0$. (Integration by parts.)


4.4.9. Compute $\int_0^\infty e^{-nx}\sin(sx)\,dx$ and $\int_0^\infty e^{-nx}\cos(sx)\,dx$ for $n \ge 1$. (Integration by parts.)


4.4.10. Show that $\int_0^\infty e^{-t^2/2} t^x\,dt = (x-1)\int_0^\infty e^{-t^2/2} t^{x-2}\,dt$ for $x > 1$. Use this to derive

$$\int_0^\infty e^{-t^2/2} t^{2n+1}\,dt = 2^n n!, \qquad n \ge 0.$$

(Integration by parts.)


4.4.11. Compute $\int_0^1 (1-t)^n t^{x-1}\,dt$ for $x > 0$ and $n \ge 1$. (Integration by parts.)


4.4.12. Compute $\int_0^1 (-\log x)^n\,dx$.


4.4.13. Show that

$$\int_{-1}^{1} (x^2-1)^n\,dx = (-1)^n\,\frac{2n\cdot(2n-2)\cdots 2}{(2n+1)\cdot(2n-1)\cdots 3}\cdot 2.$$

(Integrate by parts.)


4.4.14. For $n \ge 0$, the Legendre polynomial $P_n$ (of degree $n$) is given by $P_n(x) = f^{(n)}(x)/2^n n!$, where $f(x) = (x^2-1)^n$. Show that

$$\int_{-1}^{1} P_n(x)^2\,dx = \frac{2}{2n+1}.$$


4.4.15. If $f$ is a polynomial, show

$$\int_0^\pi f(x)\sin x\,dx = F(0) + F(\pi)$$

where $F(x) = f(x) - f''(x) + f^{(4)}(x) - \cdots$ (Exercise 3.7.11).


4.4.16. If $\pi = a/b$ is rational, let

$$I_n = \int_0^\pi g_n(x)\sin x\,dx, \qquad n \ge 1,$$

where $g_n$ is defined in Exercise 3.3.30. Show that:

• $I_n > 0$
• $I_n$ is an integer
• $I_n \to 0$ as $n \to \infty$.

Conclude that $\pi$ is irrational.


4.4.17. Use the integral test (§4.3) to show that

$$\zeta(x) = \sum_{n=1}^{\infty} \frac{1}{n^x}, \qquad x > 1,$$

converges.


4.4.18. Use the integral test (§4.3) to show that

$$\gamma = \lim_{n\to\infty}\Big(1 + \frac12 + \frac13 + \cdots + \frac1n - \log n\Big)$$

exists and $0 < \gamma < 1$. This particular real $\gamma$ is Euler's constant.


4.4.19. Compute a x cos(nx) dx and jie xsin(nx) dx for n > 0. (Integra- 
tion by parts.) 


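4.4.20. Compute $\int_{-\pi}^{\pi} f(nx)\,g(mx)\,dx$, $n, m \ge 0$, with $f(x)$ and $g(x)$ equal to $\sin x$ or $\cos x$ (three possibilities; use (3.6.3)).


4.4.21. If $f, g : (a,b) \to \mathbf{R}$ are nonnegative and continuous, derive the Cauchy-Schwarz inequality

$$\Big[\int_a^b f(x)g(x)\,dx\Big]^2 \le \int_a^b f(x)^2\,dx \int_a^b g(x)^2\,dx.$$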


(Use the fact that q(t 
polynomial and Nene 1. 


ol ) + tg(x))|? dx is a nonnegative quadratic 


4.4.22. For $n \ge 1$, show that

$$\int_0^n \frac{1-(1-t/n)^n}{t}\,dt = 1 + \frac12 + \cdots + \frac1n.$$


4.4.23. We say $f : (a,b) \to \mathbf{R}$ is piecewise differentiable if there is a partition $a = x_0 < x_1 < \cdots < x_n = b$, such that $f$ restricted to $(x_{i-1}, x_i)$ is differentiable for $i = 1, \dots, n$. Let $f : (a,b) \to \mathbf{R}$ be piecewise continuous and integrable. Show that $F(x) = \int_a^x f(t)\,dt$, $a < x < b$, is continuous on $(a,b)$ and piecewise differentiable on $(a,b)$.
4.4.24. If $f : (a,b) \to \mathbf{R}$ is nonnegative and $g : [a,b] \to \mathbf{R}$ is nonnegative and continuous, then

$$\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

(Use Exercise 4.3.5 and approximate $g$ by a piecewise constant $g_\epsilon$ as in §2.3. Since $f$ is arbitrary, linearity may not be used directly.)


4.4.25. Suppose that $g : (a,b) \to (m,M)$ is continuous, and suppose that there is a partition $a = t_0 < t_1 < \cdots < t_n = b$ of $(a,b)$, such that $g$ is differentiable, $g'$ is continuous, and $g$ is strictly monotone on each subinterval $(t_{i-1}, t_i)$, for each $i = 1, \dots, n$. For each $x$ in $(m,M)$, let $\#(x)$ denote the number of points in the inverse image $g^{-1}(\{x\})$. Also let $f : (m,M) \to \mathbf{R}$ be continuous and nonnegative. Then¹¹

$$\int_m^M f(x)\,\#(x)\,dx = \int_a^b f[g(t)]\,|g'(t)|\,dt. \tag{4.4.6}$$

(Use additivity on the integral $\int_a^b$.)


4.4.26. Let $f$ be differentiable with $f'$ continuous on $(a,b)$. Show (Exercise 2.2.4) $v_f(a,b)$ is bounded by $\int_a^b |f'(x)|\,dx$. Use Exercise 4.3.3 to show

$$v_f(a,b) = \int_a^b |f'(x)|\,dx.$$

(Restrict to a compact subinterval $[c,d] \subset (a,b)$ and rewrite the variation of $f$ over a given partition as a Riemann sum for $|f'|$.)


4.4.27. Let $f$ be a differentiable function on $(a,b)$. The length of the graph of $f$ over $(a,b)$ is defined by the formula

$$\int_a^b \sqrt{1 + f'(x)^2}\,dx. \tag{4.4.7}$$

Apply this formula to show that the length $p$ of the upper-half unit circle $y = \sqrt{1-x^2}$, $-1 < x < 1$, equals

$$\int_{-1}^{1} \frac{dx}{\sqrt{1-x^2}} = \pi.$$

Show directly, without using any trigonometry, that the integral is finite and in fact between 0 and 4. (Use $x^2 < x$ on $(0,1)$.)


4.4.28. Let $(x,y) = (\cos\theta, \sin\theta)$ be on the unit circle and assume $0 < \theta < 2\pi$. If $y > 0$, the length $L$ of the counterclockwise circular arc joining $(1,0)$ to $(x,y)$ is defined by (4.4.7). If $y < 0$, given that the length of the upper-half unit circle is $\pi$, it is natural to define the length $L$ of the counterclockwise circular arc joining $(1,0)$ to $(x,y)$ as $\pi + L'$, where $L'$ is the length of the counterclockwise circular arc joining $(-1,0)$ to $(x,y)$. Show that in either case $L = \theta$ (Figure 3.13). In particular, this shows the length of the unit circle equals $2\pi$ (Figure 4.28).


¹¹ (4.4.6) is actually valid under general conditions; see §6.6.


Fig. 4.28 Geometric definition of the inverse cosine function


4.4.29. This problem shows how one can define $\pi$, sin, cos directly and geometrically from the unit circle. Given a point $(x,y)$, $y > 0$, on the upper-half unit circle $x^2 + y^2 = 1$, we define the angle corresponding to $(x,y)$ to be the length of the circular arc lying above the interval $(x,1)$,

$$\theta(x) = \int_x^1 \frac{dt}{\sqrt{1-t^2}}, \qquad -1 \le x \le 1.$$

By Exercise 4.4.28, this agrees with the definition in §3.6. Note that $\theta(x) \le \theta(-1) = p$ is less than 4 (Exercise 4.4.27).

A. Show that $\theta'(x) = -1/\sqrt{1-x^2}$, $-1 < x < 1$.

B. Show that $\theta : [-1,1] \to [0,p]$ is continuous on $[-1,1]$ and strictly decreasing, hence by the IFT, has an inverse.

C. Let $c : [0,p] \to [-1,1]$ be the inverse of $\theta$. Show that $c' = -\sqrt{1-c^2}$ on $(0,p)$.

D. Let $s = \sqrt{1-c^2}$. Show that $s' = c$ and $c' = -s$ on $(0,p)$.

Of course, $p$ equals $\pi$ and the functions $c$, $s$ are the functions cos, sin, and this approach may be used as the basis for §3.6.


4.4.30. Let $f : (-R,R) \to \mathbf{R}$ be smooth and suppose there is an integrable $g : (-R,R) \to \mathbf{R}$ such that for $|c| < R$ and $d < R - |c|$,


L@ |S <g(z)  nz1e-d<a 


Show that the Taylor series of $f$ centered at $c$ converges to $f$ on $|x - c| < d$. Compare with Exercise 3.5.8.


4.4.31. Let $f : (0,\infty) \to \mathbf{R}$ be differentiable with $f' : (0,\infty) \to \mathbf{R}$ continuous. Assume $f$ is decreasing to zero as $x \to \infty$. Let $g : (0,\infty) \to \mathbf{R}$ be continuous with bounded primitive $G$ on $(0,\infty)$. If both $fg$ and $f'G$ are integrable over $(0,b)$ for all $b > 0$, then

$$\lim_{b\to\infty} \int_0^b f(x)g(x)\,dx$$

exists. Use integration by parts.


4.5 The Method of Exhaustion 


In this section, we compute the area of the unit disk D via the method of 
exhaustion. The Method of Exhaustion implies the Monotone Convergence 
Theorem (Theorem 5.1.2), a key building block in the results of Chapter 5. 

For $n \ge 3$, let $P_k = (\cos(2\pi k/n), \sin(2\pi k/n))$, $0 \le k \le n$. Then the points $P_k$ are evenly spaced about the unit circle $\{(x,y) : x^2 + y^2 = 1\}$, and $P_n = P_0$. Let $D_n \subset D$ be the interior of the inscribed regular $n$-sided polygon obtained by joining the points $P_0, P_1, \dots, P_n$ (we do not include the edges of $D_n$ in the definition of $D_n$). Then (Exercise 4.2.13),

$$\operatorname{area}(D_n) = \frac{n}{2}\sin(2\pi/n) = \pi\cdot\frac{\sin(2\pi/n)}{2\pi/n}.$$

Since $\lim_{x\to 0} \frac{\sin x}{x} = \sin' 0 = \cos 0 = 1$, we obtain

$$\lim_{n\to\infty} \operatorname{area}(D_n) = \pi. \tag{4.5.1}$$

Since

$$D_4 \subset D_8 \subset D_{16} \subset \cdots, \quad\text{and}\quad D = \bigcup_{n=2}^{\infty} D_{2^n}, \tag{4.5.2}$$

it is reasonable to make the guess that

$$\operatorname{area}(D) = \lim_{n\to\infty} \operatorname{area}(D_{2^n}), \tag{4.5.3}$$

and hence conclude that $\operatorname{area}(D) = \pi$. The reasoning that leads from (4.5.2) to (4.5.3) is generally correct. The result is called the method of exhaustion. Although $\operatorname{area}(D)$ was computed in the previous section using the fundamental theorem, in Chapter 5 we will need the method to compute other areas.
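The convergence in (4.5.1) along the doubling sequence $D_4, D_8, D_{16}, \dots$ can be observed numerically. The following is a minimal sketch, assuming only the area formula $\frac{n}{2}\sin(2\pi/n)$ quoted above; the function name `inscribed_area` is hypothetical.

```python
import math

def inscribed_area(n: int) -> float:
    """Area (n/2) sin(2*pi/n) of the regular n-gon inscribed in the unit circle."""
    return 0.5 * n * math.sin(2 * math.pi / n)

# Areas of D_4, D_8, D_16, ..., D_2048: increasing and approaching pi from below.
areas = [inscribed_area(2 ** k) for k in range(2, 12)]
for k, a in zip(range(2, 12), areas):
    print(2 ** k, a, math.pi - a)
```

The printed gaps shrink by roughly a factor of four with each doubling, in line with $\sin x / x = 1 - x^2/6 + \cdots$.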
We say that a sequence of sets $(A_n)$ is increasing (Figure 4.29) if $A_1 \subset A_2 \subset A_3 \subset \cdots$.


Fig. 4.29 An increasing sequence of sets


Theorem 4.5.1 (Method of Exhaustion). If $A_1 \subset A_2 \subset \cdots$ is an increasing sequence of subsets of $\mathbf{R}^2$, then

$$\operatorname{area}\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \lim_{n\to\infty} \operatorname{area}(A_n).$$


We warn the reader that the result is false, in general, for decreasing sequences. For example, take $A_n = (n,\infty) \times (-\infty,\infty)$, $n \ge 1$. Then $\operatorname{area}(A_n) = \infty$ for all $n \ge 1$, but $\bigcap_{n=1}^{\infty} A_n = \emptyset$, so $\operatorname{area}(\bigcap_{n=1}^{\infty} A_n) = 0$. This lack of symmetry between increasing and decreasing sequences is a reflection of the lack of symmetry in the definition of area: $\operatorname{area}(A)$ is defined as an inf of overestimates, not as a sup of underestimates.

The derivation of the method is not as compelling as the applications in 
Chapter 5. Moreover, the techniques used in this derivation are not used else- 
where in the text. Because of this, the reader may wish to skip the derivation 
on first reading and come back to it after progressing further. 

The method is established in three stages: first, when (definitions below) the sets $A_n$, $n \ge 1$, are open; then when the sets $A_n$, $n \ge 1$, are interopen; and, finally, for arbitrary sets $A_n$, $n \ge 1$. Open and interopen are structural properties of sets that we describe below.

We call a set $G \subset \mathbf{R}^2$ open if every point $(a,b) \in G$ can be surrounded by a nonempty, open rectangle wholly contained in $G$. For example, an open rectangle is an open set, but a compact rectangle $Q$ is not, since no point on the edges of $Q$ can be surrounded by a rectangle wholly contained in $Q$. The $n$-sided polygon $D_n$, considered above, is an open set as is the unit disk $D$. Since there are no points in $\emptyset$ for which the open criterion fails, $\emptyset$ is open.

For our purposes, the most important example of an open set is given by 
the following. 


Theorem 4.5.2. If $f \ge 0$ is continuous on $(a,b)$, its subgraph is an open subset of $\mathbf{R}^2$.


To see this, pick $x$ and $y$ with $a < x < b$ and $0 < y < f(x)$. We have to find a rectangle $Q$ containing $(x,y)$ and contained in the subgraph. Pick $y < y_1 < f(x)$. We claim there is a $c > 0$, such that $|t - x| < c$ implies $a < t < b$ and $f(t) > y_1$. If not, then for all $n \ge 1$, we can find a real $t_n$ in the interval $(x - 1/n, x + 1/n)$ contradicting the stated property, i.e., satisfying $f(t_n) \le y_1$. Then $t_n \to x$, so by continuity $f(t_n) \to f(x)$. Hence, $f(x) \le y_1$, contradicting our initial choice of $y_1$. Thus, there is a $c > 0$, such that the rectangle $Q = (x-c, x+c) \times (0, y_1)$ contains $(x,y)$ and lies in the subgraph.


Thus, the integral of a continuous nonnegative function is the area of an 
open set. 

An alternative description of open sets is in terms of distance. If $(a,b)$ is a point and $A$ is a set, then the distance $d((a,b), A)$ between the point $(a,b)$ and $A$, by definition, is the distance between the set $\{(a,b)\}$ and the set $A$ (§4.2). For example, if $Q$ is an open rectangle and $(a,b) \in Q$, then $d((a,b), Q^c)$ is positive. Here and below, $A^c = \mathbf{R}^2 \setminus A$.


Theorem 4.5.3. A set $G$ is open iff $d((a,b), G^c) > 0$ for all points $(a,b) \in G$.


Indeed, if $(a,b) \in G$ and $Q \subset G$ contains $(a,b)$, then $Q^c \supset G^c$, so $d((a,b), G^c) \ge d((a,b), Q^c) > 0$. Conversely, if $d = d((a,b), G^c) > 0$, then the disk $R$ of radius $d/2$ and center $(a,b)$ lies wholly in $G$. Now choose any rectangle $Q$ in $R$ containing $(a,b)$.

If $G$, $G'$ are open subsets, so are $G \cup G'$ and $G \cap G'$. In fact, if $(G_n)$ is a sequence of open sets, then $G = \bigcup_{n=1}^{\infty} G_n$ is open. To see this, if $(a,b) \in G$, then $(a,b) \in G_n$ for some specific $n$. Since the specific $G_n$ is open, there is a rectangle $Q$ with $(a,b) \in Q \subset G_n \subset G$. Hence, $G$ is open. Thus, an infinite union of open sets is open. If $G_1, \dots, G_n$ are finitely many open sets, then $G = G_1 \cap G_2 \cap \dots \cap G_n$ is open. To see this, if $(a,b) \in G$, then $(a,b) \in G_k$ for all $1 \le k \le n$, so there are open rectangles $Q_k$ with $(a,b) \in Q_k \subset G_k$, $1 \le k \le n$. Hence, $Q = \bigcap_{k=1}^{n} Q_k$ is an open rectangle containing $(a,b)$ and contained in $G$ (a finite intersection of open rectangles is an open rectangle). Thus, a finite intersection of open sets is open. However, an infinite intersection of open sets need not be open.

If $A \subset \mathbf{R}^2$ is any set and $\epsilon > 0$, by definition of area, we can find an open set $G$ containing $A$ and satisfying $\operatorname{area}(G) \le \operatorname{area}(A) + \epsilon$ (Exercise 4.5.6). If we had additivity and $\operatorname{area}(A) < \infty$, writing $\operatorname{area}(G) = \operatorname{area}(A) + \operatorname{area}(G \setminus A)$, we would conclude that $\operatorname{area}(G \setminus A) \le \epsilon$. Conversely, if we are seeking properties of sets that guarantee additivity, we may, instead, focus on those sets $M$ in $\mathbf{R}^2$ satisfying the above approximability condition: For all $\epsilon > 0$, there is an open superset $G$ of $M$, such that $\operatorname{area}(G \setminus M) \le \epsilon$. Instead of doing this, however, it will be quicker for us to start with an alternate equivalent (Exercise 4.5.15) formulation.

We say a set $M \subset \mathbf{R}^2$ is measurable if

$$\operatorname{area}(A) = \operatorname{area}(A \cap M) + \operatorname{area}(A \cap M^c), \quad\text{for all } A \subset \mathbf{R}^2. \tag{4.5.4}$$

For example, the empty set is measurable, and $M$ is measurable iff $M^c$ is measurable. Below, we show that every open set is measurable. Measurability may be looked upon as a strengthened form of additivity, since the equality in (4.5.4) is required to hold for every $A \subset \mathbf{R}^2$. Note that the trick, below, of summing alternate areas $C_1, C_3, C_5, \dots$ was already used in deriving additivity in Theorem 4.3.3. Compare the next derivation with that derivation!

In §4.2, we established additivity when the sets were well separated. Now we establish a similar result involving open sets.


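Theorem 4.5.4. If $G$ is open, then $G$ is measurable.

To see this, we need show only that

$$\operatorname{area}(A) \ge \operatorname{area}(A \cap G) + \operatorname{area}(A \cap G^c) \tag{4.5.5}$$

for every $A \subset \mathbf{R}^2$, since the reverse inequality follows by subadditivity. Let $A \subset \mathbf{R}^2$ be arbitrary. If $\operatorname{area}(A) = \infty$, (4.5.5) is immediate, so let us assume that $\operatorname{area}(A) < \infty$. Let $G_n$ be the set of points in $G$ whose distance from $G^c$ is at least $1/n$. Since $A \cap G_n$ and $A \cap G^c$ are well separated (Figure 4.30),

$$\operatorname{area}(A) \ge \operatorname{area}(A \cap G_n) + \operatorname{area}(A \cap G^c).$$

By subadditivity,

$$\operatorname{area}(A \cap G) \le \operatorname{area}(A \cap G_n) + \operatorname{area}(A \cap G \cap G_n^c).$$

Combining the last two inequalities, we obtain

$$\operatorname{area}(A) \ge \operatorname{area}(A \cap G) + \operatorname{area}(A \cap G^c) - \operatorname{area}(A \cap G \cap G_n^c). \tag{4.5.6}$$

Thus, if we show that

$$\lim_{n\to\infty} \operatorname{area}(A \cap G \cap G_n^c) = 0, \tag{4.5.7}$$

letting $n \to \infty$ in (4.5.6), we obtain (4.5.5), hence the result.

To obtain (4.5.7), let $C_n$ be the set of points $(a,b)$ in $G$ satisfying $1/(n+1) \le d((a,b), G^c) < 1/n$. Since $G$ is open, $d((a,b), G^c) > 0$ for every point in $G$. Thus,

$$G \cap G_n^c = C_n \cup C_{n+1} \cup C_{n+2} \cup \cdots.$$

But the sets $C_n, C_{n+2}, C_{n+4}, \dots$ are well separated. Hence,

$$\operatorname{area}(A \cap C_n) + \operatorname{area}(A \cap C_{n+2}) + \operatorname{area}(A \cap C_{n+4}) + \cdots \le \operatorname{area}(A \cap G).$$

Since $C_{n+1}, C_{n+3}, C_{n+5}, \dots$ are well separated,

$$\operatorname{area}(A \cap C_{n+1}) + \operatorname{area}(A \cap C_{n+3}) + \operatorname{area}(A \cap C_{n+5}) + \cdots \le \operatorname{area}(A \cap G).$$

Adding the last two inequalities, by subadditivity, we obtain

$$\operatorname{area}(A \cap G \cap G_n^c) \le \sum_{k=n}^{\infty} \operatorname{area}(A \cap C_k) \le 2\operatorname{area}(A \cap G) < \infty. \tag{4.5.8}$$

Now (4.5.8) with $n = 1$ shows that the series $\sum_{k=1}^{\infty} \operatorname{area}(A \cap C_k)$ converges. Thus, the tail series, starting from $k = n$ in (4.5.8), approaches zero, as $n \to \infty$. This establishes (4.5.7).


Fig. 4.30 An open set is measurable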


Now we establish the method for measurable, hence for open, sets. In fact, 
we need to establish a strengthened form of the method for measurable sets. 


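Theorem 4.5.5 (Measurable Method of Exhaustion). If $M_1 \subset M_2 \subset \cdots$ is an increasing sequence of measurable subsets of $\mathbf{R}^2$ and $A \subset \mathbf{R}^2$ is arbitrary, then

$$\operatorname{area}\Big[A \cap \Big(\bigcup_{n=1}^{\infty} M_n\Big)\Big] = \lim_{n\to\infty} \operatorname{area}(A \cap M_n).$$


To derive this, let $M_\infty = \bigcup_{n=1}^{\infty} M_n$. Since $A \cap M_n \subset A \cap M_\infty$, by monotonicity, the sequence $(\operatorname{area}(A \cap M_n))$ is increasing and bounded above by $\operatorname{area}(A \cap M_\infty)$. Thus,

$$\lim_{n\to\infty} \operatorname{area}(A \cap M_n) \le \operatorname{area}(A \cap M_\infty).$$

To obtain the reverse inequality, apply (4.5.4) with $M$ and $A$, there, replaced by $M_1$ and $A \cap M_2$ respectively, obtaining

$$\operatorname{area}(A \cap M_2) = \operatorname{area}(A \cap M_2 \cap M_1) + \operatorname{area}(A \cap M_2 \cap M_1^c).$$

Since $A \cap M_2 \cap M_1 = A \cap M_1$, this implies that

$$\operatorname{area}(A \cap M_2) = \operatorname{area}(A \cap M_1) + \operatorname{area}(A \cap M_2 \cap M_1^c).$$

Now apply (4.5.4) with $M$ and $A$, there, replaced by $M_2$ and $A \cap M_3$ respectively, obtaining

$$\begin{aligned}
\operatorname{area}(A \cap M_3) &= \operatorname{area}(A \cap M_2) + \operatorname{area}(A \cap M_3 \cap M_2^c) \\
&= \operatorname{area}(A \cap M_1) + \operatorname{area}(A \cap M_2 \cap M_1^c) + \operatorname{area}(A \cap M_3 \cap M_2^c).
\end{aligned}$$

Proceeding in this manner, we obtain

$$\operatorname{area}(A \cap M_n) = \operatorname{area}(A \cap M_1) + \sum_{k=2}^{n} \operatorname{area}(A \cap M_k \cap M_{k-1}^c).$$

Sending $n \to \infty$, we obtain

$$\lim_{n\to\infty} \operatorname{area}(A \cap M_n) = \operatorname{area}(A \cap M_1) + \sum_{k=2}^{\infty} \operatorname{area}(A \cap M_k \cap M_{k-1}^c).$$

Since

$$M_1 \cup (M_2 \cap M_1^c) \cup (M_3 \cap M_2^c) \cup \cdots = M_\infty,$$

subadditivity implies that

$$\operatorname{area}(A \cap M_\infty) \le \operatorname{area}(A \cap M_1) + \sum_{k=2}^{\infty} \operatorname{area}(A \cap M_k \cap M_{k-1}^c).$$

Hence, we obtain the reverse inequality

$$\lim_{n\to\infty} \operatorname{area}(A \cap M_n) \ge \operatorname{area}(A \cap M_\infty).$$

By choosing $A = \mathbf{R}^2$, we conclude that the method is valid for measurable, hence open sets. This completes stage one of the derivation of the method.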

Next we establish the method for interopen sets. A set $I \subset \mathbf{R}^2$ is interopen if $I$ is the infinite intersection of a sequence of open sets $(G_n)$, $I = \bigcap_{n=1}^{\infty} G_n$. Of course, every open set is interopen. Also every compact rectangle is interopen (Exercise 4.5.5). The key feature of interopen sets is that any set $A$ can be covered by some interopen set $I$, $A \subset I$, having the same area, $\operatorname{area}(A) = \operatorname{area}(I)$ (Exercise 4.5.7).


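Theorem 4.5.6. If $(M_n)$ is a sequence of measurable sets, then $\bigcap_{n=1}^{\infty} M_n$ is measurable.


To derive this theorem, we start with two measurable sets $M$, $N$, and we show that $M \cap N$ is measurable. First, note that

$$(M \cap N)^c = (M \cap N^c) \cup (M^c \cap N) \cup (M^c \cap N^c). \tag{4.5.9}$$

Let $A \subset \mathbf{R}^2$ be arbitrary. Since $N$ is measurable, write (4.5.4) with $A \cap M$ and $N$ replacing $A$ and $M$, respectively, obtaining

$$\operatorname{area}(A \cap M) = \operatorname{area}(A \cap M \cap N) + \operatorname{area}(A \cap M \cap N^c).$$

Now write (4.5.4) with $A \cap M^c$ and $N$ replacing $A$ and $M$ respectively, obtaining

$$\operatorname{area}(A \cap M^c) = \operatorname{area}(A \cap M^c \cap N) + \operatorname{area}(A \cap M^c \cap N^c).$$

Now insert the last two equalities in (4.5.4). By (4.5.9) and subadditivity, we obtain

$$\begin{aligned}
\operatorname{area}(A) &= \operatorname{area}(A \cap M) + \operatorname{area}(A \cap M^c) \\
&= \operatorname{area}(A \cap (M \cap N)) + \operatorname{area}(A \cap (M \cap N^c)) \\
&\quad + \operatorname{area}(A \cap (M^c \cap N)) + \operatorname{area}(A \cap (M^c \cap N^c)) \\
&\ge \operatorname{area}(A \cap (M \cap N)) + \operatorname{area}(A \cap (M \cap N)^c).
\end{aligned}$$

Hence,

$$\operatorname{area}(A) \ge \operatorname{area}(A \cap (M \cap N)) + \operatorname{area}(A \cap (M \cap N)^c).$$

Since the reverse inequality is an immediate consequence of subadditivity, we conclude that $M \cap N$ is measurable.

Now let $(M_n)$ be a sequence of measurable sets and set $N_n = \bigcap_{k=1}^{n} M_k$, $n \ge 1$. Then $N_n$, $n \ge 1$, are measurable. Indeed $N_1 = M_1$ is measurable. For the inductive step, suppose that $N_n$ is measurable. Since $N_{n+1} = N_n \cap M_{n+1}$, we conclude that $N_{n+1}$ is measurable. Hence, by induction, $N_n$ is measurable for all $n \ge 1$. Now $M_\infty = \bigcap_{n=1}^{\infty} M_n = \bigcap_{n=1}^{\infty} N_n$ and $N_1 \supset N_2 \supset \cdots$. Hence, $N_1^c \subset N_2^c \subset \cdots$, so by the measurable method, we obtain

$$\operatorname{area}(A \cap M_\infty^c) = \operatorname{area}\Big[A \cap \Big(\bigcap_{n=1}^{\infty} N_n\Big)^c\Big] = \operatorname{area}\Big[A \cap \bigcup_{n=1}^{\infty} N_n^c\Big] = \lim_{n\to\infty} \operatorname{area}(A \cap N_n^c). \tag{4.5.10}$$

Here we used De Morgan's law (§1.1). Now for each $n \ge 1$, since $M_\infty \subset N_n$,

$$\begin{aligned}
\operatorname{area}(A) &= \operatorname{area}(A \cap N_n) + \operatorname{area}(A \cap N_n^c) \\
&\ge \operatorname{area}(A \cap M_\infty) + \operatorname{area}(A \cap N_n^c).
\end{aligned} \tag{4.5.11}$$

Sending $n \to \infty$ in (4.5.11) and using (4.5.10) yields

$$\operatorname{area}(A) \ge \operatorname{area}(A \cap M_\infty) + \operatorname{area}(A \cap M_\infty^c).$$

Since the reverse inequality follows from subadditivity, we conclude that $M_\infty = \bigcap_{n=1}^{\infty} M_n$ is measurable.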

By choosing $(M_n)$ in the theorem to consist of open sets, we see that every interopen set is measurable. Hence, we conclude that the method is valid for interopen sets. This completes stage two of the derivation of the method.

The third and final stage of the derivation of the method is to establish it for an increasing sequence of arbitrary sets. To this end, let $A_1 \subset A_2 \subset \cdots$ be an arbitrary increasing sequence of sets. For each $n \ge 1$, by Exercise 4.5.7, choose an interopen set $I_n$ containing $A_n$ and having the same area: $I_n \supset A_n$, and $\operatorname{area}(I_n) = \operatorname{area}(A_n)$. For each $n \ge 1$, let

$$J_n = I_n \cap I_{n+1} \cap I_{n+2} \cap \cdots.$$

Then $J_n$ is interopen, $A_n \subset J_n \subset I_n$, and $\operatorname{area}(J_n) = \operatorname{area}(A_n)$, for all $n \ge 1$. Moreover, $J_n = I_n \cap J_{n+1}$. Hence (and this is the reason for introducing the sequence $(J_n)$), the sequence $(J_n)$ is increasing. Thus, by applying the method for interopen sets,

$$\lim_{n\to\infty} \operatorname{area}(A_n) = \lim_{n\to\infty} \operatorname{area}(J_n) = \operatorname{area}\Big(\bigcup_{n=1}^{\infty} J_n\Big) \ge \operatorname{area}\Big(\bigcup_{n=1}^{\infty} A_n\Big). \tag{4.5.12}$$

On the other hand, by monotonicity, the sequence $(\operatorname{area}(A_n))$ is increasing and bounded above by $\operatorname{area}(\bigcup_{n=1}^{\infty} A_n)$. Hence,

$$\lim_{n\to\infty} \operatorname{area}(A_n) \le \operatorname{area}\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

Combining this with (4.5.12), we conclude that

$$\lim_{n\to\infty} \operatorname{area}(A_n) = \operatorname{area}\Big(\bigcup_{n=1}^{\infty} A_n\Big).$$

This completes stage three, hence the derivation of the method.

We end by describing the connection between the areas of the inscribed and circumscribed polygons of the unit disk $D$, as the number of sides doubles. Let

$$P'_k = \Big(\frac{\cos(2\pi k/n)}{\cos(\pi/n)}, \frac{\sin(2\pi k/n)}{\cos(\pi/n)}\Big), \qquad 0 \le k \le n.$$

Then the points $P'_k$ are evenly spaced about the circle $\{(x,y) : x^2 + y^2 = \sec^2(\pi/n)\}$, and $P'_n = P'_0$. Let $D'_n$ denote the interior of the regular $n$-sided polygon obtained by joining the points $P'_0, \dots, P'_n$ by line segments. Then $D'_n \supset D$ and $D'_n = c \cdot D_n$ with $c = \sec(\pi/n)$. Hence, by dilation invariance, we obtain

$$\operatorname{area}(D'_n) = c^2 \cdot \operatorname{area}(D_n) = n\tan(\pi/n)$$

which also goes to $\pi$ as $n \to \infty$.

Let $a_n$, $a'_n$ denote the areas of the inscribed and circumscribed $n$-sided polygons $D_n$, $D'_n$, respectively. Then using trigonometry, one obtains (Exercise 4.5.11)

$$a_{2n} = \sqrt{a_n a'_n} \tag{4.5.13}$$

and

$$\frac{1}{a'_{2n}} = \frac12\Big(\frac{1}{a_{2n}} + \frac{1}{a'_n}\Big). \tag{4.5.14}$$

Since $a_4 = 2$ and $a'_4 = 4$, we obtain $a_8 = 2\sqrt{2}$ and $a'_8 = 8(\sqrt{2} - 1)$. Thus,

$$2\sqrt{2} < \pi < 8(\sqrt{2} - 1).$$

Continuing in this manner, one obtains approximations to $\pi$. These identities
are very similar to those leading to Gauss’s arithmetic-geometric mean, which 
we discuss in §5.3. 
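The doubling identities (4.5.13) and (4.5.14) translate directly into an iteration that squeezes $\pi$ between inscribed and circumscribed polygon areas. The sketch below starts from the square values $a_4 = 2$, $a'_4 = 4$ stated above; the function name `polygon_bounds` and the number of doublings are illustrative choices, not part of the text.

```python
import math

def polygon_bounds(doublings: int):
    """Iterate (4.5.13)-(4.5.14) starting from a_4 = 2, a'_4 = 4.

    Returns a list of (n, a_n, a'_n): inscribed and circumscribed
    polygon areas for n = 4, 8, 16, ...
    """
    a, a_prime = 2.0, 4.0
    n = 4
    history = [(n, a, a_prime)]
    for _ in range(doublings):
        a = math.sqrt(a * a_prime)                 # a_{2n} = sqrt(a_n a'_n)
        a_prime = 2.0 / (1.0 / a + 1.0 / a_prime)  # 1/a'_{2n} = (1/a_{2n} + 1/a'_n)/2
        n *= 2
        history.append((n, a, a_prime))
    return history

for n, lower, upper in polygon_bounds(10):
    print(n, lower, upper)
```

Note that the second update uses the freshly computed $a_{2n}$ together with the previous $a'_n$, exactly as (4.5.14) is written.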


Exercises 


4.5.1. If $Q$ is an open rectangle and $(x,y) \in Q$, then $d((x,y), Q^c) > 0$.

4.5.2. Find a sequence $(A_n)$ of open sets, such that $\bigcap_{n=1}^{\infty} A_n$ is not open.


4.5.3. A set $A$ is closed if $A^c$ is open. Show that a compact rectangle is closed, an infinite intersection of closed sets is closed, and a finite union of closed sets is closed. Find a sequence $(A_n)$ of closed sets, such that $\bigcup_{n=1}^{\infty} A_n$ is not closed. (You will need De Morgan's law (§1.1).)


4.5.4. Given a real $a$, let $L_a$ denote the vertical infinite line through $a$, $L_a = \{(x,y) : x = a, y \in \mathbf{R}\}$. Also set $L_\infty = L_{-\infty} = \emptyset$. Let $f$ be nonnegative and continuous on $(a,b)$. Show that

$$C = \{(x,y) : a < x < b,\ 0 \le y \le f(x)\} \cup L_a \cup L_b$$

is a closed set and

$$\int_a^b f(x)\,dx = \operatorname{area}(C).$$

This shows that the integral of a continuous nonnegative function is also the area of a closed set. (Compare $C$ with the subgraph of $f(x) + \epsilon/(1 + x^2)$ for $\epsilon > 0$ small.)


4.5.5. Show that $C$ is closed iff

$$d((x,y), C) = 0 \implies (x,y) \in C.$$

If $C$ is closed and $G_n = \{(x,y) : d((x,y), C) < 1/n\}$, then $G_n$ is open and $C = \bigcap_{n=1}^{\infty} G_n$. Thus, every closed set is interopen.


4.5.6. Let $A \subset \mathbf{R}^2$ be arbitrary. Use the definition of $\operatorname{area}(A)$ to show: For all $\epsilon > 0$, there is an open superset $G$ of $A$ satisfying $\operatorname{area}(G) \le \operatorname{area}(A) + \epsilon$. Conclude that

$$\operatorname{area}(A) = \inf\{\operatorname{area}(G) : A \subset G,\ G \text{ open}\}.$$


4.5.7. Let $A \subset \mathbf{R}^2$ be arbitrary. Show that there is an interopen set $I$ containing $A$ and having the same area as $A$ (use Exercise 4.5.6).

4.5.8. Show that $M$ is measurable iff there is an interopen superset $I \supset M$ satisfying $\operatorname{area}(I - M) = 0$ (use Exercise 4.5.7).


4.5.9. If $(M_n)$ is a sequence of measurable sets, then $\bigcup_{n=1}^{\infty} M_n$ is measurable.

4.5.10. Show that $D'_n \supset D$.

4.5.11. Derive (4.5.13) and (4.5.14).


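4.5.12. If $A$ and $B$ are disjoint and $A$ is measurable, then $\operatorname{area}(A \cup B) = \operatorname{area}(A) + \operatorname{area}(B)$.


4.5.13. If $(A_n)$ is a sequence of disjoint measurable sets, then

$$\operatorname{area}\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \operatorname{area}(A_n).$$

4.5.14. If $A$ and $B$ are measurable, then $\operatorname{area}(A \cup B) = \operatorname{area}(A) + \operatorname{area}(B) - \operatorname{area}(A \cap B)$.


4.5.15. Show that $M$ is measurable iff, for all $\epsilon > 0$, there is an open superset $G$ of $M$, such that $\operatorname{area}(G \setminus M) \le \epsilon$.


4.5.16. Let $A \subset \mathbf{R}^2$ be measurable. If $\operatorname{area}(A) > 0$, there is an $\epsilon > 0$, such that $\operatorname{area}(A \cap A') > 0$ for all translates $A' = A + (a,b)$ of $A$ with $|a| < \epsilon$ and $|b| < \epsilon$. (Start with $A$ a rectangle, and use Exercise 4.2.15.)


4.5.17. $N \subset \mathbf{R}^2$ is negligible if $\operatorname{area}(N) = 0$. If $N$ is negligible, then $N$ is measurable.


4.5.18. If $A \subset \mathbf{R}^2$ is measurable and $\operatorname{area}(A) > 0$, let

$$A - A = \{(x - x', y - y') : (x,y) \text{ and } (x',y') \in A\}$$

be the set of differences. Note that $A - A$ contains the origin. Then for some $\epsilon > 0$, $A - A$ must contain the open rectangle $Q_\epsilon = (-\epsilon,\epsilon) \times (-\epsilon,\epsilon)$. (Use Exercise 4.5.16.)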


Chapter 5 
Applications 


5.1 Euler’s Gamma Function 


In this section, we derive the formula

$$\int_0^1 \frac{dx}{x^x} = \sum_{n=1}^{\infty} \frac{1}{n^n} = \frac{1}{1^1} + \frac{1}{2^2} + \frac{1}{3^3} + \cdots. \tag{5.1.1}$$

Along the way, we will meet Euler's gamma function and the monotone convergence theorem, both of which play roles in subsequent sections.

The gamma function is defined by

$$\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\,dt, \qquad x > 0.$$

Clearly $\Gamma(x)$ is positive for $x > 0$. Below we see that the gamma function is finite, and, in the next section, we see that it is continuous. Since

$$\int_0^\infty e^{-at}\,dt = \frac{e^{-at}}{-a}\Big|_0^\infty = \frac{1}{a}\big(e^{-a0} - e^{-a\infty}\big) = \frac{1}{a}, \qquad a > 0, \tag{5.1.2}$$

we have $\Gamma(1) = 1$. Below, we use the convention $0! = 1$.

Theorem 5.1.1. The gamma function $\Gamma(x)$ is positive, finite, and $\Gamma(x+1) = x\Gamma(x)$ for $x > 0$. Moreover, $\Gamma(n) = (n-1)!$ for $n \ge 1$.

To derive the first identity, use integration by parts with $u = t^x$, $dv = e^{-t}\,dt$. Then $v = -e^{-t}$, and $du = x t^{x-1}\,dt$. Hence, we obtain the following equality between primitives:

$$\int e^{-t} t^x\,dt = -e^{-t} t^x + x \int e^{-t} t^{x-1}\,dt.$$

Since $e^{-t} t^x$ vanishes at $t = 0$ and $t = \infty$ for $x > 0$ fixed, and the integrands are positive, by the fundamental theorem, we obtain

$$\begin{aligned}
\Gamma(x+1) &= \int_0^\infty e^{-t} t^x\,dt \\
&= -e^{-t} t^x \Big|_0^\infty + x \int_0^\infty e^{-t} t^{x-1}\,dt \\
&= 0 + x\Gamma(x) = x\Gamma(x).
\end{aligned}$$

Note that this identity is true whether or not $\Gamma(x)$ is finite. We derive $\Gamma(n) = (n-1)!$ by induction. The statement is true for $n = 1$ since $\Gamma(1) = 1$ from above. Assuming the statement is true for $n$, $\Gamma(n+1) = n\Gamma(n) = n(n-1)! = n!$. Hence, the statement is true for all $n \ge 1$. Now we show that $\Gamma(x)$ is finite for all $x > 0$. Since the integral $\int_0^1 e^{-t} t^{x-1}\,dt \le \int_0^1 t^{x-1}\,dt = 1/x$ is finite for $x > 0$, it is enough to verify integrability of $e^{-t} t^{x-1}$ over $(1,\infty)$. Over this interval, $e^{-t} t^{x-1}$ increases with $x$; hence, $\int_1^\infty e^{-t} t^{x-1}\,dt \le \int_1^\infty e^{-t} t^{n-1}\,dt \le \Gamma(n)$ for any natural $n \ge x$. But we already know that $\Gamma(n) = (n-1)! < \infty$, hence the result.

Because of this result, we define $x! = \Gamma(x+1)$ for $x > -1$. For example, in Exercise 5.4.1, we obtain $(1/2)! = \sqrt{\pi}/2$.
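The statements of Theorem 5.1.1 can be spot-checked numerically. The sketch below uses Python's standard-library `math.gamma` as a stand-in for $\Gamma$; the helper name `generalized_factorial` is hypothetical and simply encodes the definition $x! = \Gamma(x+1)$.

```python
import math

# Functional equation Gamma(x+1) = x Gamma(x) at a few sample points.
for x in (0.5, 1.7, 3.2):
    assert math.isclose(math.gamma(x + 1), x * math.gamma(x), rel_tol=1e-12)

# Gamma(n) = (n-1)! for natural n.
for n in range(1, 10):
    assert math.isclose(math.gamma(n), math.factorial(n - 1), rel_tol=1e-12)

def generalized_factorial(x: float) -> float:
    """x! = Gamma(x + 1), defined for x > -1."""
    return math.gamma(x + 1)

# (1/2)! should equal sqrt(pi)/2.
print(generalized_factorial(0.5), math.sqrt(math.pi) / 2)
```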

We already know (linearity §4.4) that the integral of a finite sum of con- 
tinuous functions is the sum of their integrals. To obtain linearity for infinite 
sums, we first derive the following. 


Theorem 5.1.2 (Monotone Convergence Theorem). Let $0 \le f_1 \le f_2 \le f_3 \le \cdots$ be an increasing sequence of nonnegative functions,¹ all defined on an interval $(a,b)$. If

$$\lim_{n\to\infty} f_n(x) = f(x), \qquad a < x < b,$$

then

$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n\to\infty} f_n(x)\,dx = \int_a^b f(x)\,dx.$$

We caution that this result may be false when the sequence $(f_n)$ is not increasing (Exercise 5.1.1). Nevertheless, one can still obtain roughly half this result for any sequence $(f_n)$ of nonnegative functions (Exercise 5.1.2).


¹ These functions need not be continuous; they may be arbitrary.

To see this, let $G_n$ denote the subgraph of $f_n$ over $(a,b)$, $G$ the subgraph of $f$ over $(a,b)$. Then $G_n \subset G_{n+1}$ since $y < f_n(x)$ implies $y < f_{n+1}(x)$. Moreover, $y < f(x)$ iff $y < f_n(x)$ for some $n \ge 1$; hence, $G = \bigcup_{n=1}^{\infty} G_n$. The result now follows from the method of exhaustion and the definition of the integral of a nonnegative function.

In the next section, we show that $\Gamma$ is continuous; in Exercise 5.1.7, we show that $\Gamma$ is convex. Later (§5.4), we show that $\Gamma$ is strictly convex. Since $\Gamma(x) = \Gamma(x+1)/x$ for $x > 0$ and $\Gamma(1) = 1$, it follows that $\Gamma(0+) = \infty$. Also for $x > n$, $\Gamma(x) \ge \int_1^\infty e^{-t} t^{x-1}\,dt \ge \int_1^\infty e^{-t} t^{n-1}\,dt \ge (n-1)! - 1/n$; hence, $\Gamma(\infty) = \infty$. Putting all of this together, we conclude that $\Gamma$ has exactly one global positive minimum and the graph for $x > 0$ is as in Figure 5.1. Later (§5.8), we will extend the domain of $\Gamma$ to negative reals.


Fig. 5.1 The gamma function 


Theorem 5.1.3 (Summation Under the Integral Sign: Positive Case). Let $f_n$, $n \ge 1$, be a sequence of nonnegative functions on $(a,b)$. If $f_n$, $n \ge 1$, are continuous, then

$$\int_a^b \Big[\sum_{n=1}^{\infty} f_n(x)\Big]\,dx = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx.$$


A key aspect of the proof is the use of the linearity of the integral, as derived in §4.4 for continuous functions.²

For alternating series, a different version of this result is needed (next 
section). 


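² The general version is in Chapter 6.

To derive this, use linearity and the monotone convergence theorem (Theorem 5.1.2) to obtain

$$\begin{aligned}
\int_a^b \Big[\sum_{n=1}^{\infty} f_n(x)\Big]\,dx &= \int_a^b \lim_{N\to\infty} \Big[\sum_{n=1}^{N} f_n(x)\Big]\,dx \\
&= \lim_{N\to\infty} \int_a^b \Big[\sum_{n=1}^{N} f_n(x)\Big]\,dx \\
&= \lim_{N\to\infty} \sum_{n=1}^{N} \int_a^b f_n(x)\,dx \\
&= \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx.
\end{aligned}$$

Now we derive (5.1.1). To see this, we use the substitution $x = e^{-t}$ (Exercise 4.4.7), the exponential series, the previous theorem, shifting the index $n$ by one, the substitution $nt = s$ (dilation invariance), and the property $\Gamma(n) = (n-1)!$:

$$\begin{aligned}
\int_0^1 x^{-x}\,dx &= \int_0^\infty e^{te^{-t}} e^{-t}\,dt \\
&= \int_0^\infty \Big[\sum_{n=0}^{\infty} \frac{1}{n!}\, t^n e^{-nt}\Big] e^{-t}\,dt \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} \int_0^\infty t^n e^{-(n+1)t}\,dt \\
&= \sum_{n=1}^{\infty} \frac{1}{(n-1)!} \int_0^\infty e^{-nt} t^{n-1}\,dt \\
&= \sum_{n=1}^{\infty} \frac{1}{(n-1)!}\, n^{-n} \int_0^\infty e^{-s} s^{n-1}\,ds \\
&= \sum_{n=1}^{\infty} \frac{1}{(n-1)!}\, n^{-n}\, \Gamma(n) = \sum_{n=1}^{\infty} n^{-n}.
\end{aligned}$$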

We end with a special case of the monotone convergence theorem. 


Theorem 5.1.4 (Monotone Convergence Theorem (For Series)). Let $(a_{nj})$, $n \ge 1$, be a sequence of sequences, and let $(a_j)$ be a given sequence. Suppose that $0 \le a_{1j} \le a_{2j} \le a_{3j} \le \cdots$ for all $j \ge 1$. If

$$\lim_{n\to\infty} a_{nj} = a_j, \qquad j \ge 1,$$

then

$$\lim_{n\to\infty} \sum_{j=1}^{\infty} a_{nj} = \sum_{j=1}^{\infty} a_j.$$

To see this, define piecewise constant functions $f_n(x) = a_{nj}$, $j - 1 < x \le j$, $j \ge 1$, and $f(x) = a_j$, $j - 1 < x \le j$, $j \ge 1$. Then $(f_n)$ is nonnegative on $(0,\infty)$ and increasing to $f$. Now apply the monotone convergence theorem for integrals, and use Exercise 4.3.6.

Using this theorem, one can derive an analog of summation under the 
integral sign involving series (“summation under the summation sign” ) rather 
than integrals. But we already did this in §1.7. 
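A small numerical illustration of the monotone convergence theorem for series, under an example chosen here purely for illustration (it does not appear in the text): take $a_{nj} = r_n^j$ with $r_n = \tfrac12(1 - \tfrac{1}{n+1})$, which increases in $n$ to $a_j = 2^{-j}$. The row sums should then increase to $\sum_j 2^{-j} = 1$.

```python
def row_sum(n: int, terms: int = 200) -> float:
    """Truncated sum over j of a_{nj} = r_n^j, where r_n = (1 - 1/(n+1))/2.

    Since r_n <= 1/2, truncating at 200 terms changes the sum by
    less than 2**-200, which is negligible in floating point.
    """
    r = 0.5 * (1.0 - 1.0 / (n + 1))
    return sum(r ** j for j in range(1, terms + 1))

limit_sum = sum(0.5 ** j for j in range(1, 201))   # sum of the limits a_j = 2^{-j}
sums = [row_sum(n) for n in (1, 10, 100, 1000, 10000)]
print(sums, limit_sum)
```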


Exercises 


5.1.1. Find a sequence $f_1 \ge f_2 \ge f_3 \ge \cdots \ge 0$ of nonnegative functions, such that $f_n(x) \to 0$ for all $x \in \mathbf{R}$, but $\int_{-\infty}^{\infty} f_n(x)\,dx = \infty$ for all $n \ge 1$. This shows that the monotone convergence theorem is false for decreasing sequences.


5.1.2. (Fatou's lemma) Let $f_n$, $n \ge 1$, be nonnegative functions, all defined on $(a,b)$, and suppose that $f_n(x) \to f(x)$ for all $x$ in $(a,b)$. Then the lower limit of the sequence $(\int_a^b f_n(x)\,dx)$ is greater or equal to $\int_a^b f(x)\,dx$,

$$\int_a^b f(x)\,dx \le \liminf_{n\to\infty} \int_a^b f_n(x)\,dx.$$

(For each $x$, let $(g_n(x))$ equal the lower sequence (§1.5) of the sequence $(f_n(x))$ and apply the monotone convergence theorem.)


5.1.3. Let $f_0(x) = 1 - x^2$ for $|x| < 1$ and $f_0(x) = 0$ for $|x| \ge 1$, and let $f_n(x) = f_0(x - n)$ for $-\infty < x < \infty$ and $n \ge 1$. Compute $f(x) = \lim_{n\to\infty} f_n(x)$, $-\infty < x < \infty$, $\int_{-\infty}^{\infty} f_n(x)\,dx$, $n \ge 1$, and $\int_{-\infty}^{\infty} f(x)\,dx$ (Figure 5.2). Conclude that, for this example, the inequality in Fatou's lemma is strict.


Fig. 5.2 Exercise 5.1.3
5.1.4. Show that

$$\Gamma(x) = \lim_{n\to\infty} \int_0^n \Big(1 - \frac{t}{n}\Big)^n t^{x-1}\,dt, \qquad x > 0.$$

(Use Exercise 3.2.4.)


5.1.5. Use substitution $t = ns$, and integrate by parts to get

$$\int_0^n \Big(1 - \frac{t}{n}\Big)^n t^{x-1}\,dt = \frac{n^x\, n!}{x(x+1)\cdots(x+n)}, \qquad x > 0,$$

for $n \ge 1$. Conclude that

$$\Gamma(x) = \lim_{n\to\infty} \frac{n^x\, n!}{x(x+1)\cdots(x+n)}, \qquad x > 0, \tag{5.1.3}$$

and

$$x! = \lim_{n\to\infty} \frac{n^x\, n!}{(x+1)(x+2)\cdots(x+n)}, \qquad x > -1. \tag{5.1.4}$$


5.1.6. We say that a function $f : (a,b) \to \mathbf{R}^+$ is log-convex if $\log f$ is convex (§3.3). Show that the right side of (5.1.3) is log-convex on $(0,\infty)$. Suppose that $f_n : (a,b) \to \mathbf{R}$, $n \ge 1$, is a sequence of convex functions, and $f_n(x) \to f(x)$, as $n \to \infty$, for all $x$ in $(a,b)$. Show that $f$ is convex on $(a,b)$. Conclude that the gamma function is log-convex on $(0,\infty)$.


5.1.7. Show that $\Gamma$ is convex on $(0,\infty)$. (Consider $\Gamma = \exp(\log \Gamma)$ and use Exercise 3.3.3.)
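5.1.8. Let $s_n(t)$ denote the $n$th partial sum of

$$\frac{1}{e^t - 1} = e^{-t} + e^{-2t} + e^{-3t} + \cdots, \qquad t > 0.$$

Use $s_n(t)$ to derive

$$\int_0^\infty \frac{t^{x-1}}{e^t - 1}\,dt = \Gamma(x)\zeta(x), \qquad x > 1,$$

where $\zeta(x) = \sum_{n=1}^{\infty} n^{-x}$, $x > 1$.

5.1.9. Let

$$\psi(t) = \sum_{n=1}^{\infty} e^{-n^2\pi t}, \qquad t > 0.$$

Show that

$$\int_0^\infty \psi(t)\, t^{x/2 - 1}\,dt = \pi^{-x/2}\, \Gamma(x/2)\, \zeta(x), \qquad x > 1,$$

where $\zeta$ is as in the previous exercise.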
5.1.10. Show that $\int_0^1 t^{x-1} (-\log t)^{n-1}\,dt = \Gamma(n)/x^n$ for $x > 0$ and $n \ge 1$ (Exercise 4.4.7).


5.1.11. Show that

$$\int_0^\infty e^{-t} t^{x-1} |\log t|^{n-1}\,dt \le \frac{\Gamma(n)}{x^n} + \Gamma(x + n - 1)$$

for $x > 0$ and $n \ge 1$. (Break up the integral into $\int_0^1 + \int_1^\infty$ and use $\log t < t$ for $t > 1$.)


5.1.12. Use the monotone convergence theorem for series to compute $\zeta(1+)$ and $\psi(0+)$.


5.1.13. With 7(t) = t/(1— e~*), show that 


(Compare with Exercise 5.1.8.) 


5.1.14. Use the monotone convergence theorem to derive continuity at the 
endpoints (§4.3): If f : (a,b) > R is nonnegative and a, \, a, by, 7 b, then 


Son f(a) da + f° f(x) de. 
5.1.15. Show that
$$\exp\left(\int_s^{s+1} \log\Gamma(x)\,dx\right) = \text{constant} \times \left(\frac{s}{e}\right)^s, \qquad s > 0.$$
The left side is the geometric mean of $\Gamma$ over the interval $(s, s+1)$. The
constant is evaluated in Exercise 5.5.9.


5.2 The Number $\pi$


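In this section, we discuss several formulas for the irrational (Exercise 1.3.18)
number $\pi$, namely:

• The Leibnitz series
$$\frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + \cdots \tag{5.2.1}$$
• The Wallis product
$$\frac{\pi}{2} = \frac{2\cdot2\cdot4\cdot4\cdot6\cdot6\cdots}{1\cdot3\cdot3\cdot5\cdot5\cdot7\cdots} \tag{5.2.2}$$
• The Vieta formula
$$\frac{2}{\pi} = \frac{1}{\sqrt2}\cdot\sqrt{\frac12+\frac{1}{2\sqrt2}}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12+\frac{1}{2\sqrt2}}}\cdots \tag{5.2.3}$$
• The continued fraction expansion
$$\frac{\pi}{4} = \cfrac{1}{1+\cfrac{1}{2+\cfrac{9}{2+\cfrac{25}{2+\cfrac{49}{2+\cdots}}}}} \tag{5.2.4}$$
• The Bailey-Borwein-Plouffe series
$$\pi = \sum_{n=0}^\infty \frac{1}{16^n}\left(\frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6}\right) \tag{5.2.5}$$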


Along the way, we will meet the dominated convergence theorem, and we also 
compute the Laplace transform of the Bessel function of order zero. 
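
As a rough numerical companion to the formulas listed above, the following Python sketch evaluates truncations of (5.2.1), (5.2.2), (5.2.3), and (5.2.5) in floating point. The function names and truncation levels are illustrative choices, not part of the text; the only aim is to compare how quickly each expression approaches $\pi$.

```python
import math

def leibniz(terms):
    """4 * (1 - 1/3 + 1/5 - ...), truncated after `terms` terms of (5.2.1)."""
    return 4.0 * sum((-1) ** k / (2 * k + 1) for k in range(terms))

def wallis(factors):
    """2 * prod_{k=1}^{factors} (2k)^2 / ((2k-1)(2k+1)), from (5.2.2)."""
    product = 1.0
    for k in range(1, factors + 1):
        product *= (2 * k) ** 2 / ((2 * k - 1) * (2 * k + 1))
    return 2.0 * product

def vieta(factors):
    """2 divided by the product of the first `factors` factors of (5.2.3)."""
    c = 0.0          # c runs through cos(pi/4), cos(pi/8), ... via the half-angle formula
    product = 1.0
    for _ in range(factors):
        c = math.sqrt((1.0 + c) / 2.0)
        product *= c
    return 2.0 / product

def bbp(terms):
    """Partial sum of the Bailey-Borwein-Plouffe series (5.2.5)."""
    return sum(
        (4 / (8 * k + 1) - 2 / (8 * k + 4) - 1 / (8 * k + 5) - 1 / (8 * k + 6)) / 16 ** k
        for k in range(terms)
    )

if __name__ == "__main__":
    rows = [("Leibnitz, 1000 terms", leibniz(1000)),
            ("Wallis, 1000 factors", wallis(1000)),
            ("Vieta, 20 factors", vieta(20)),
            ("BBP, 10 terms", bbp(10))]
    for name, value in rows:
        print(f"{name:22s} {value:.15f}  error {abs(value - math.pi):.2e}")
```

The Leibnitz series and the Wallis product gain roughly one digit per tenfold increase in work, while the Vieta product and the Bailey-Borwein-Plouffe series reach double precision after a few dozen steps.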

It is one thing to derive these remarkable formulas and quite another to
discover them. We begin by rederiving the Leibnitz series for $\pi$ by a method
different from that in §3.7.

Start with the power series expansion
$$\frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + \cdots, \qquad 0 < x < 1. \tag{5.2.6}$$

We seek to integrate (5.2.6), term by term, as in §4.4. Since $\arctan 1 = \pi/4$
and $\arctan x$ is a primitive of $1/(1 + x^2)$, we seek to integrate (5.2.6) over
the interval $(0,1)$. However, since the radius of convergence of (5.2.6) is 1,
the result in §4.4 is not applicable. On the other hand, the theorem in §5.1
allows us to integrate, term by term, any series of nonnegative functions.
Since (5.2.6) is alternating, again this is not applicable.
If we let $s_n(x)$ denote the $n$th partial sum in (5.2.6), then by the Leibnitz
test (§1.7),
$$0 \le s_n(x) \le 1, \qquad 0 < x < 1,\ n \ge 1. \tag{5.2.7}$$


It turns out that (5.2.7) allows us to integrate (5.2.6) over the interval (0,1), 
term by term. This is captured in the following theorem. 


Theorem 5.2.1 (Dominated Convergence Theorem). Let $f_n$, $n \ge 1$,
be a sequence of functions defined on $(a,b)$. Suppose there is a function $g$
integrable on $(a,b)$ satisfying $|f_n(x)| \le g(x)$ for all $x$ in $(a,b)$ and all $n \ge 1$. If
$$\lim_{n\nearrow\infty} f_n(x) = f(x), \qquad a < x < b,$$
then $f$ and $f_n$, $n \ge 1$, are integrable on $(a,b)$. If $g$, $f$, and $f_n$, $n \ge 1$, are
continuous, then
$$\lim_{n\nearrow\infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n\nearrow\infty} f_n(x)\,dx = \int_a^b f(x)\,dx. \tag{5.2.8}$$

Note that (5.2.8) says we can switch the limit and the integral, exactly
as in the monotone convergence theorem. This theorem takes its name from
the hypothesis $|f_n(x)| \le g(x)$, $a < x < b$, which is read $f_n$ is dominated by $g$
over $(a,b)$. The point of this hypothesis is the existence of a single integrable
$g$ that dominates all the $f_n$'s.

The two results, the monotone convergence theorem and the dominated 
convergence theorem, are used throughout analysis to justify the interchange 
of integrals and limits. Which theorem is applied when depends on which 
hypothesis is applicable to the problem at hand. When trigonometric or more 
general oscillatory functions are involved, the monotone convergence theorem 
is not applicable. In these cases, it is the dominated convergence theorem that 
saves the day. 

When $(a,b) = (0,\infty)$ and the functions $f_n$, $n \ge 1$, $f$, $g$, are piecewise
constant, the dominated convergence theorem reduces to a theorem about
series, which we discuss at the end of the section. Also one can allow the
interval $(a_n, b_n)$ to vary with $n \ge 1$ (Exercise 5.2.14). We defer the derivation
of the dominated convergence theorem to the end of the section.

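Going back to the derivation of (5.2.1), since the $n$th partial sum $s_n$ converges
to $f(x) = 1/(1+x^2)$ and $|s_n(x)| \le 1$ by (5.2.7), we can choose $g(x) = 1$
which is integrable on $(0,1)$. Hence, applying the fundamental theorem and
the dominated convergence theorem yields
$$\begin{aligned}
\frac{\pi}{4} = \arctan 1 - \arctan 0 &= \int_0^1 \frac{1}{1+x^2}\,dx = \int_0^1 \left[\sum_{n=0}^\infty (-1)^n x^{2n}\right] dx \\
&= \lim_{N\nearrow\infty} \sum_{n=0}^N (-1)^n \int_0^1 x^{2n}\,dx = \sum_{n=0}^\infty (-1)^n \int_0^1 x^{2n}\,dx \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} = 1 - \frac13 + \frac15 - \frac17 + \cdots.
\end{aligned}$$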


This completes the derivation of (5.2.1). 
The idea behind this derivation of (5.2.1) can be carried out more
generally.

Theorem 5.2.2 (Summation Under the Integral Sign: Alternating
Case). Let $f_n$, $n \ge 1$, be a decreasing sequence of nonnegative functions
on $(a,b)$, and suppose that $f_1$ is integrable on $(a,b)$. Then $f_n$, $n \ge 1$, and
$\sum_{n=1}^\infty (-1)^{n-1} f_n$ are integrable on $(a,b)$. If $f_n$, $n \ge 1$, and $\sum_{n=1}^\infty (-1)^{n-1} f_n$
are continuous, then
$$\int_a^b \left[\sum_{n=1}^\infty (-1)^{n-1} f_n(x)\right] dx = \sum_{n=1}^\infty (-1)^{n-1} \int_a^b f_n(x)\,dx.$$


To derive this, we need only note that the $n$th partial sum $s_n$ is nonnegative
and no greater than $g = f_1$, which is integrable. Hence, we may apply the
dominated convergence theorem, as above, to the sequence $(s_n)$ of partial
sums.

For example, using this theorem to integrate the geometric series
$1/(1+x) = 1 - x + x^2 - x^3 + \cdots$ over $(0,1)$, we obtain
$$\log 2 = 1 - \frac12 + \frac13 - \frac14 + \cdots.$$


Now we discuss the general case. 


Theorem 5.2.3 (Summation Under the Integral Sign: Absolute
Case). Let $f_n$, $n \ge 1$, be a sequence of functions on $(a,b)$, and suppose that
there is a function $g$ integrable on $(a,b)$ and satisfying $\sum_{n=1}^\infty |f_n(x)| \le g(x)$
for all $x$ in $(a,b)$. Then $f_n$, $n \ge 1$, and $\sum_{n=1}^\infty f_n$ are integrable. If $g$, $f_n$,
$n \ge 1$, and $\sum_{n=1}^\infty f_n$ are continuous, then
$$\int_a^b \left[\sum_{n=1}^\infty f_n(x)\right] dx = \sum_{n=1}^\infty \int_a^b f_n(x)\,dx.$$

To derive this, we need only note that $|s_n(x)| \le |f_1(x)| + \cdots + |f_n(x)| \le g(x)$,
which is integrable. Hence, we may apply the dominated convergence
theorem, as above, to the sequence $(s_n)$ of partial sums.

The Bessel function of order zero is defined by
$$J_0(x) = \sum_{n=0}^\infty \frac{(-1)^n x^{2n}}{4^n (n!)^2}, \qquad -\infty < x < \infty.$$
To check the convergence, rewrite the series using Exercise 3.5.10 obtaining
$$J_0(x) = \sum_{n=0}^\infty \binom{-1/2}{n} \frac{x^{2n}}{(2n)!}, \qquad -\infty < x < \infty.$$
Now use the definition of $\binom{v}{n}$ (§3.5) to check the inequality $\left|\binom{-1/2}{n}\right| \le 1$ for
all $n \ge 0$. Hence,
$$|J_0(x)| \le \sum_{n=0}^\infty \left|\binom{-1/2}{n}\right| \frac{|x|^{2n}}{(2n)!} \le \sum_{n=0}^\infty \frac{|x|^{2n}}{(2n)!} \le e^{|x|}, \qquad -\infty < x < \infty. \tag{5.2.9}$$
This shows that the series $J_0$ converges absolutely for all $x$ real. Since $J_0$
is a convergent power series, $J_0$ is a smooth function on $\mathbf{R}$. We wish to use
summation under the integral sign to obtain
$$\int_0^\infty e^{-sx} J_0(x)\,dx = \frac{1}{\sqrt{1+s^2}}, \qquad s > 1. \tag{5.2.10}$$


The left side of (5.2.10), by definition, is the Laplace transform of $J_0$.
Thus, (5.2.10) exhibits the Laplace transform of the Bessel function $J_0$. In
Exercise 5.2.2, you are asked to derive the Laplace transform of $\sin x/x$.

To obtain (5.2.10), fix $s > 1$, and set $f_n(x) = e^{-sx}\binom{-1/2}{n}x^{2n}/(2n)!$, $n \ge 0$.
Then by (5.2.9), we may apply summation under the integral sign with
$g(x) = e^{-sx}e^{x}$, $x > 0$, which is positive, continuous, and integrable (since $s > 1$).
Hence,
$$\int_0^\infty e^{-sx} J_0(x)\,dx = \sum_{n=0}^\infty \binom{-1/2}{n} \frac{1}{(2n)!} \int_0^\infty e^{-sx} x^{2n}\,dx.$$
Inserting the substitution $x = t/s$, $dx = dt/s$, and recalling Newton's generalization
of the binomial theorem (§3.5) yields
$$\begin{aligned}
\int_0^\infty e^{-sx} J_0(x)\,dx &= \sum_{n=0}^\infty \binom{-1/2}{n} \frac{1}{(2n)!\, s^{2n+1}} \int_0^\infty e^{-t} t^{2n}\,dt \\
&= \frac1s \sum_{n=0}^\infty \binom{-1/2}{n} \frac{\Gamma(2n+1)}{(2n)!\, s^{2n}} \\
&= \frac1s \sum_{n=0}^\infty \binom{-1/2}{n} \left(\frac{1}{s^2}\right)^n \\
&= \frac1s \cdot \frac{1}{\sqrt{1 + 1/s^2}} = \frac{1}{\sqrt{1+s^2}}.
\end{aligned}$$
This establishes (5.2.10). 
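
The last steps of this computation can be spot-checked numerically. The sketch below, with illustrative function names, uses the closed form $\binom{-1/2}{n} = (-1)^n\binom{2n}{n}/4^n$ (consistent with the two series for $J_0$ above), sums $\sum_n \binom{-1/2}{n} s^{-(2n+1)}$, and compares with $1/\sqrt{1+s^2}$. It checks only the summation after term-by-term integration, not the interchange of sum and integral itself.

```python
import math

def bessel_coeff(n):
    """binom(-1/2, n), written as (-1)^n * binom(2n, n) / 4^n."""
    return (-1) ** n * math.comb(2 * n, n) / 4 ** n

def laplace_j0_series(s, terms=60):
    """sum_n binom(-1/2, n) / s^(2n+1): the series obtained after integrating term by term."""
    return sum(bessel_coeff(n) / s ** (2 * n + 1) for n in range(terms))

if __name__ == "__main__":
    for s in (2.0, 3.0, 10.0):
        print(s, laplace_j0_series(s), 1.0 / math.sqrt(1.0 + s * s))
```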
Now let
$$F(x) = \int_0^\infty \frac{\sin(xt)}{1+t^2}\,dt, \qquad -\infty < x < \infty.$$
We use the dominated convergence theorem to show that $F$ is continuous
on $\mathbf{R}$. To show this, fix $x \in \mathbf{R}$, and let $x_n \to x$; we have to show that
$F(x_n) \to F(x)$. Set $f_n(t) = \sin(x_nt)/(1+t^2)$, $f(t) = \sin(xt)/(1+t^2)$, and
$g(t) = 1/(1+t^2)$. Then $f_n(t)$, $n \ge 1$, $f(t)$, and $g(t)$ are continuous; all the
$f_n(t)$'s are dominated by $g(t)$ over $(0,\infty)$, $f_n(t) \to f(t)$ for all $t > 0$, and
$g(t)$ is integrable over $(0,\infty)$ since $\int_0^\infty g(t)\,dt = \pi/2$. Hence, the theorem
applies, and
$$\begin{aligned}
\lim_{n\nearrow\infty} F(x_n) &= \lim_{n\nearrow\infty} \int_0^\infty \frac{\sin(x_nt)}{1+t^2}\,dt \\
&= \int_0^\infty \lim_{n\nearrow\infty} \frac{\sin(x_nt)}{1+t^2}\,dt \\
&= \int_0^\infty \frac{\sin(xt)}{1+t^2}\,dt = F(x).
\end{aligned}$$

This establishes the continuity of F’. 

Similarly, one can establish the continuity of the gamma function on
$(0,\infty)$. To this end, choose $0 < a < x < b < \infty$, and let $x_n \to x$ with
$a < x_n < b$. We have to show $\Gamma(x_n) \to \Gamma(x)$. Now $f_n(t) = e^{-t}t^{x_n-1}$ satisfies
$$0 \le f_n(t) \le \begin{cases} e^{-t}t^{b-1}, & 1 \le t < \infty, \\ e^{-t}t^{a-1}, & 0 < t \le 1. \end{cases}$$
If we call the right side of this inequality $g(t)$, we see that $f_n(t)$, $n \ge 1$, are
all dominated by $g(t)$ over $(0,\infty)$. Moreover, $g(t)$ is continuous (especially at
$t = 1$) and integrable over $(0,\infty)$, since $\int_0^\infty g(t)\,dt \le \Gamma(a) + \Gamma(b)$. Also the
functions $f_n(t)$, $n \ge 1$, and $f(t) = e^{-t}t^{x-1}$ are continuous, and $f_n(t) \to f(t)$
for all $t > 0$. Thus, the dominated convergence theorem applies. Hence,
$$\begin{aligned}
\lim_{n\nearrow\infty} \Gamma(x_n) &= \lim_{n\nearrow\infty} \int_0^\infty e^{-t}t^{x_n-1}\,dt \\
&= \int_0^\infty \lim_{n\nearrow\infty} e^{-t}t^{x_n-1}\,dt \\
&= \int_0^\infty e^{-t}t^{x-1}\,dt = \Gamma(x).
\end{aligned}$$
Hence, $\Gamma$ is continuous on $(a,b)$. Since $0 < a < b$ are arbitrary, $\Gamma$ is continuous
on $(0,\infty)$.

Next, we derive Wallis' product (5.2.2). Begin with integrating by parts
to obtain
$$\int \sin^n x\,dx = -\frac1n \sin^{n-1}x\cos x + \frac{n-1}{n}\int \sin^{n-2}x\,dx, \qquad n \ge 2. \tag{5.2.11}$$
Evaluating at $0$ and $\pi/2$ yields
$$\int_0^{\pi/2} \sin^n x\,dx = \frac{n-1}{n}\int_0^{\pi/2} \sin^{n-2}x\,dx, \qquad n \ge 2.$$
Since $\int_0^{\pi/2} \sin^0 x\,dx = \pi/2$ and $\int_0^{\pi/2} \sin^1 x\,dx = 1$, by the last equation and
induction,
$$I_{2n} = \int_0^{\pi/2} \sin^{2n}x\,dx = \frac{(2n-1)\cdot(2n-3)\cdots1}{2n\cdot(2n-2)\cdots2}\cdot\frac{\pi}{2}$$
and
$$I_{2n+1} = \int_0^{\pi/2} \sin^{2n+1}x\,dx = \frac{2n\cdot(2n-2)\cdots2}{(2n+1)\cdot(2n-1)\cdots3}\cdot1,$$
for $n \ge 1$. Since $0 < \sin x < 1$ on $(0,\pi/2)$, the integrals $I_n$ are decreasing in
$n$. But, by the formula for $I_n$ with $n$ odd,
$$1 \le \frac{I_{2n-1}}{I_{2n+1}} = 1 + \frac{1}{2n}, \qquad n \ge 1.$$
Thus
$$1 \le \frac{I_{2n}}{I_{2n+1}} \le \frac{I_{2n-1}}{I_{2n+1}} = 1 + \frac{1}{2n}, \qquad n \ge 1,$$
or $I_{2n}/I_{2n+1} \to 1$, as $n \nearrow \infty$. Since
$$\frac{I_{2n}}{I_{2n+1}} = \frac{(2n+1)\cdot(2n-1)\cdot(2n-1)\cdots3\cdot3\cdot1}{2n\cdot2n\cdot(2n-2)\cdots2\cdot2}\cdot\frac{\pi}{2},$$
we obtain (5.2.2).
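
The squeeze used in this derivation is easy to watch numerically. The sketch below (function names are illustrative) builds the exact rational factor multiplying $\pi/2$ in the last display and checks that $I_{2n}/I_{2n+1}$ stays between $1$ and $1 + 1/(2n)$.

```python
import math
from fractions import Fraction

def rational_factor(n):
    """Exact value of prod_{k=1}^n (2k-1)(2k+1)/(2k)^2, i.e. (I_{2n}/I_{2n+1}) / (pi/2)."""
    r = Fraction(1)
    for k in range(1, n + 1):
        r *= Fraction((2 * k - 1) * (2 * k + 1), (2 * k) ** 2)
    return r

def integral_ratio(n):
    """Floating-point value of I_{2n} / I_{2n+1}."""
    return float(rational_factor(n)) * math.pi / 2

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, integral_ratio(n), 1 + 1 / (2 * n))
```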
A derivation of Vieta's formula (5.2.3) starts with the identity
$$\frac{\sin\theta}{2^n\sin(\theta/2^n)} = \cos\left(\frac{\theta}{2}\right)\cos\left(\frac{\theta}{4}\right)\cdots\cos\left(\frac{\theta}{2^n}\right) \tag{5.2.12}$$
which follows by multiplying both sides by $\sin(\theta/2^n)$ and using the double-angle
formula $\sin(2x) = 2\sin x\cos x$ repeatedly. Now insert in (5.2.12) $\theta = \pi/2$,
and use, repeatedly, the formula $\cos(\theta/2) = \sqrt{(1+\cos\theta)/2}$. This yields
$$\frac{2}{\pi}\cdot\frac{\pi/2^{n+1}}{\sin(\pi/2^{n+1})} = \frac{1}{\sqrt2}\cdot\sqrt{\frac12+\frac{1}{2\sqrt2}}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12+\frac{1}{2\sqrt2}}}\cdots$$
where the last ($n$th) factor involves $n$ square roots. Letting $n \nearrow \infty$ yields
(5.2.3) since $\sin x/x \to 1$ as $x \to 0$.

To derive the continued fraction expansion (5.2.4), first, we must understand
what it means. To this end, introduce the convergents
$$c_n = \cfrac{1}{1+\cfrac{1}{2+\cfrac{9}{2+\cfrac{25}{2+\cfrac{49}{\ddots+\cfrac{(2n-1)^2}{2}}}}}}. \tag{5.2.13}$$
Then we take (5.2.4) to mean that the sequence $(c_n)$ converges to $\pi/4$. To
derive this, it is enough to show that $c_n$ equals the $n$th partial sum
$$s_n = 1 - \frac13 + \frac15 - \frac17 + \cdots \pm \frac{1}{2n+1}$$
of the Leibnitz series (5.2.1) for all $n \ge 1$.
Given reals $a_1, \ldots, a_n$, let
$$s_n^* = a_1 + a_1a_2 + a_1a_2a_3 + \cdots + a_1a_2\cdots a_n.$$
Then $s_n^* = s_n^*(a_1, \ldots, a_n)$ is a function of the $n$ variables $a_1, \ldots, a_n$. Later,
we will make a judicious choice of $a_1, \ldots, a_n$. Note that
$$a_1 + a_1 s_n^*(a_2, \ldots, a_{n+1}) = s_{n+1}^*(a_1, a_2, \ldots, a_{n+1}).$$
Let $f(x,y) = x/(1 + x - y)$, and let
$$c_1^*(a_1) = f(a_1, 0) = \frac{a_1}{1+a_1},$$
$$c_2^*(a_1, a_2) = f(a_1, f(a_2, 0)) = \cfrac{a_1}{1 + a_1 - \cfrac{a_2}{1+a_2}},$$
$$c_3^*(a_1, a_2, a_3) = f(a_1, f(a_2, f(a_3, 0))) = \cfrac{a_1}{1 + a_1 - \cfrac{a_2}{1 + a_2 - \cfrac{a_3}{1+a_3}}},$$
and so on. More systematically, define $c_n^*(a_1, \ldots, a_n)$ inductively by setting
$c_1^*(a_1) = a_1/(1+a_1)$ and
$$c_{n+1}^*(a_1, a_2, \ldots, a_{n+1}) = \frac{a_1}{1 + a_1 - c_n^*(a_2, \ldots, a_{n+1})}, \qquad n \ge 1.$$
We claim that:
$$c_n^* = \frac{s_n^*}{1 + s_n^*}, \qquad n \ge 1, \tag{5.2.14}$$
and we verify this by induction. Here $c_n^* = c_n^*(a_1, \ldots, a_n)$ and $s_n^* = s_n^*(a_1, \ldots, a_n)$.
Clearly $c_1^* = s_1^*/(1 + s_1^*)$ since $s_1^* = a_1$. Now assume $c_n^* = s_n^*/(1 + s_n^*)$.
Replacing $a_1, \ldots, a_n$ by $a_2, \ldots, a_{n+1}$ yields
$$c_n^*(a_2, \ldots, a_{n+1}) = s_n^*(a_2, \ldots, a_{n+1})/[1 + s_n^*(a_2, \ldots, a_{n+1})].$$
Then
$$\begin{aligned}
c_{n+1}^*(a_1, a_2, \ldots, a_{n+1}) &= \frac{a_1}{1 + a_1 - c_n^*(a_2, \ldots, a_{n+1})} \\
&= \cfrac{a_1}{1 + a_1 - \cfrac{s_n^*(a_2, \ldots, a_{n+1})}{1 + s_n^*(a_2, \ldots, a_{n+1})}} \\
&= \frac{s_{n+1}^*(a_1, a_2, \ldots, a_{n+1})}{1 + s_{n+1}^*(a_1, a_2, \ldots, a_{n+1})}.
\end{aligned}$$
Thus, $c_{n+1}^* = s_{n+1}^*/(1 + s_{n+1}^*)$. Hence, by induction, the claim is true.
Solving for $1 + s_n^*$ in (5.2.14) and multiplying by $a_0$ yield
$$a_0 + a_0 s_n^* = \frac{a_0}{1 - c_n^*}.$$

We arrive at Euler’s continued fraction formula. 


Theorem 5.2.4. For $a_0, a_1, a_2, \ldots, a_n$ real,
$$a_0 + a_0a_1 + \cdots + a_0a_1\cdots a_n = \cfrac{a_0}{1 - \cfrac{a_1}{1 + a_1 - \cfrac{a_2}{1 + a_2 - \cfrac{\ddots}{1 + a_{n-1} - \cfrac{a_n}{1 + a_n}}}}}.$$
Now choose
$$a_0 = 1,\quad a_1 = -\frac13,\quad a_2 = -\frac35,\quad a_3 = -\frac57,\quad \ldots,\quad a_n = -\frac{2n-1}{2n+1}.$$
Then the left side of Euler's continued fraction formula is $s_n$, and the right
side is
$$\cfrac{1}{1 + \cfrac{1/3}{1 - 1/3 + \cfrac{3/5}{1 - 3/5 + \cfrac{\ddots}{\ddots + \cfrac{(2n-1)/(2n+1)}{1 - (2n-1)/(2n+1)}}}}}. \tag{5.2.15}$$
Now multiply the top and bottom of the first fraction by 1, and then the top
and bottom of the second fraction by 3, and then the top and bottom of the
third fraction by 5, and so on. Then (5.2.15) becomes $c_n$. Since $s_n \to \pi/4$,
we conclude that $c_n \to \pi/4$. This completes the derivation of (5.2.4).
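
Euler's continued fraction formula and its specialization to the Leibnitz series can be confirmed in exact rational arithmetic. The sketch below uses illustrative function names and evaluates both sides from the innermost fraction outward.

```python
from fractions import Fraction

def euler_sum(a):
    """a_0 + a_0 a_1 + ... + a_0 a_1 ... a_n for a = [a_0, ..., a_n]."""
    total, prod = Fraction(0), Fraction(1)
    for x in a:
        prod *= x
        total += prod
    return total

def euler_cf(a):
    """Right side of Euler's formula, evaluated from the innermost fraction outward."""
    tail = Fraction(0)                      # starting value, so the innermost step gives a_n/(1+a_n)
    for x in reversed(a[1:]):
        tail = x / (1 + x - tail)           # c*_{k+1}(x, ...) = x / (1 + x - c*_k(...))
    return a[0] / (1 - tail)

def leibniz_coefficients(n):
    """a_0 = 1, a_k = -(2k-1)/(2k+1) for k = 1..n."""
    return [Fraction(1)] + [Fraction(-(2 * k - 1), 2 * k + 1) for k in range(1, n + 1)]

def leibniz_convergent(n):
    """c_n from (5.2.13): 1/(1 + 1/(2 + 9/(2 + ... + (2n-1)^2/2)))."""
    tail = Fraction(0)
    for k in range(n, 0, -1):
        tail = Fraction((2 * k - 1) ** 2) / (2 + tail)
    return 1 / (1 + tail)

if __name__ == "__main__":
    for n in range(1, 6):
        a = leibniz_coefficients(n)
        print(n, euler_sum(a), euler_cf(a), leibniz_convergent(n))
```

All three columns printed for a given $n$ are the same fraction, which is the content of the identity $c_n = s_n$.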

The series (5.2.5) is remarkable not only because of its rapid convergence,
but because it can be used to compute specific digits in the hexadecimal (base
16, see §1.6) expansion of $\pi$, without computing all previous digits ([5]).

To obtain (5.2.5), check that
$$\frac{4\sqrt2 - 8x^3 - 4\sqrt2x^4 - 8x^5}{1 - x^8} = \frac{4\sqrt2 - 4x}{x^2 - \sqrt2x + 1} - \frac{4x}{1 - x^2}$$
using $x^8 - 1 = (x^4 - 1)(x^4 + 1)$ and $x^4 + 1 = (x^2 + \sqrt2x + 1)(x^2 - \sqrt2x + 1)$.
Hence, (Exercises 3.7.15 and 3.7.16),
$$\int \frac{4\sqrt2 - 8x^3 - 4\sqrt2x^4 - 8x^5}{1 - x^8}\,dx = 4\arctan(\sqrt2x - 1) - 2\log(x^2 - \sqrt2x + 1) + 2\log(1 - x^2).$$
Evaluating at $0$ and $1/\sqrt2$ yields
$$\pi = \int_0^{1/\sqrt2} \frac{4\sqrt2 - 8x^3 - 4\sqrt2x^4 - 8x^5}{1 - x^8}\,dx. \tag{5.2.16}$$
To see the equivalence of (5.2.16) and (5.2.5), note that
$$\begin{aligned}
\int_0^{1/\sqrt2} \frac{x^{k-1}}{1 - x^8}\,dx &= \int_0^{1/\sqrt2} \sum_{n=0}^\infty x^{k-1+8n}\,dx \\
&= \sum_{n=0}^\infty \int_0^{1/\sqrt2} x^{k-1+8n}\,dx \\
&= \frac{1}{2^{k/2}} \sum_{n=0}^\infty \frac{1}{16^n(8n+k)}.
\end{aligned}$$
Now use this with $k$ equal 1, 4, 5, and 6, and insert the resulting four series
in (5.2.16). You obtain (5.2.5).
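
The remark that (5.2.5) yields individual hexadecimal digits can be made concrete. The sketch below follows the widely used digit-extraction idea: multiply the series by $16^n$, discard integer parts with modular exponentiation, and read off the leading hexadecimal digit of the remaining fractional part. This is offered as an illustration rather than as the procedure of [5]; the function names are hypothetical, position $n = 0$ means the first digit after the point, and double-precision arithmetic limits the sketch to modest $n$.

```python
def scaled_series_fraction(j, n):
    """Fractional part of 16^n * sum_{k>=0} 1 / (16^k (8k + j))."""
    s = 0.0
    for k in range(n + 1):
        m = 8 * k + j
        # Only 16^(n-k) mod m matters for the fractional part of 16^(n-k)/m.
        s = (s + pow(16, n - k, m) / m) % 1.0
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0

def pi_hex_digit(n):
    """Hexadecimal digit of pi at position n after the point (n = 0, 1, 2, ...)."""
    x = (4 * scaled_series_fraction(1, n) - 2 * scaled_series_fraction(4, n)
         - scaled_series_fraction(5, n) - scaled_series_fraction(6, n)) % 1.0
    return "0123456789ABCDEF"[int(16 * x)]

if __name__ == "__main__":
    print("".join(pi_hex_digit(n) for n in range(8)))
```

The cost of `pi_hex_digit(n)` grows roughly linearly in $n$ (times the cost of a modular power), and no earlier digit is ever stored.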

To derive the dominated convergence theorem, we will need Fatou's lemma
which is Exercise 5.1.2. This states that, for any sequence $f_n : (a,b) \to \mathbf{R}$,
$n \ge 1$, of nonnegative functions satisfying $f_n(x) \to f(x)$ for all $x$ in $(a,b)$, the
lower limit of the sequence $\left(\int_a^b f_n(x)\,dx\right)$ is greater or equal to $\int_a^b f(x)\,dx$.
Although shelved as an exercise, we caution the reader that Fatou's lemma
is so frequently useful that it rivals the monotone convergence theorem and
the dominated convergence theorem in importance.

Let $I^*$ and $I_*$ denote the upper and lower limits of the sequence
$(I_n) = \left(\int_a^b f_n(x)\,dx\right)$, and let $I = \int_a^b f(x)\,dx$. It is enough to show that
$$I \le I_* \le I^* \le I, \tag{5.2.17}$$
since this implies the convergence of $(I_n)$ to $I$.

If $f_n$, $n \ge 1$, are as given, then $\pm f_n(x) \le g(x)$. Hence, $g(x) - f_n(x)$
and $g(x) + f_n(x)$, $n \ge 1$, are nonnegative and converge to $g(x) - f(x)$ and
$g(x) + f(x)$, respectively, for all $x$ in $(a,b)$.

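Apply Fatou's lemma to the sequence $(g + f_n)$. Then
$$\begin{aligned}
\int_a^b g(x)\,dx + I = \int_a^b [g(x) + f(x)]\,dx &\le \liminf_{n\nearrow\infty} \int_a^b [g(x) + f_n(x)]\,dx \\
&= \int_a^b g(x)\,dx + \liminf_{n\nearrow\infty} I_n = \int_a^b g(x)\,dx + I_*;
\end{aligned}$$
hence, $I \le I_*$, which is half of (5.2.17). Here we are justified in using linearity
(Theorem 4.4.5) since the functions $f$, $g$, and $f_n$, $n \ge 1$, are continuous.
Now apply Fatou's lemma to the sequence $(g - f_n)$. Then
$$\begin{aligned}
\int_a^b g(x)\,dx - I = \int_a^b [g(x) - f(x)]\,dx &\le \liminf_{n\nearrow\infty} \int_a^b [g(x) - f_n(x)]\,dx \\
&= \int_a^b g(x)\,dx - \limsup_{n\nearrow\infty} I_n = \int_a^b g(x)\,dx - I^*;
\end{aligned}$$
hence, $I \ge I^*$ which is the other half of (5.2.17).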
A useful consequence of the dominated convergence theorem is the
following.

Theorem 5.2.5 (Continuity Under the Integral Sign). Let $f : (a,b) \times (c,d) \to \mathbf{R}$
be such that $f(\cdot,t)$ is continuous for all $c < t < d$. Suppose there
is an integrable $g : (c,d) \to \mathbf{R}$ satisfying $|f(x,t)| \le g(t)$ for $a < x < b$ and
$c < t < d$. Then $f(x,\cdot)$ is integrable on $(c,d)$ for $a < x < b$ and
$$F(x) = \int_c^d f(x,t)\,dt, \qquad a < x < b, \tag{5.2.18}$$
is well defined. If $g$ and $f(x,\cdot)$, $a < x < b$, are continuous, then $F$ is continuous.

Note that the domination hypothesis guarantees that $F$ is well defined.
To establish this, fix $x$ in $(a,b)$ and let $x_n \to x$. We have to show that
$F(x_n) \to F(x)$. Let $k_n(t) = f(x_n,t)$, $c < t < d$, $n \ge 1$, and $k(t) = f(x,t)$,
$c < t < d$. Then $k_n(t)$ and $k(t)$ are continuous on $(c,d)$ and $k_n(t) \to k(t)$ for
$c < t < d$. By the domination hypothesis, $|k_n(t)| \le g(t)$. Thus, the dominated
convergence theorem applies, and
$$F(x_n) = \int_c^d k_n(t)\,dt \to \int_c^d k(t)\,dt = F(x).$$
For example, the continuity of the gamma function on $(a,b)$, $0 < a < b < \infty$,
follows by choosing, as in the beginning of the section,
$$g(t) = \begin{cases} e^{-t}t^{b-1}, & 1 \le t < \infty, \\ e^{-t}t^{a-1}, & 0 < t \le 1. \end{cases}$$
Moreover, the separate continuity of $f$ in $(x,t)$ is immediate because $f$ is the
product of two continuous functions, one of $x$ and one of $t$. Because continuity
is established in the same manner in all our examples below, we will usually
omit this step.

In fact, continuity under the integral sign is nothing but a packaging of 
the derivation of continuity of I’, presented earlier. 

Let us go back to the statement of the dominated convergence theorem.
When $(a,b) = (0,\infty)$ and the functions $(f_n)$, $f$, and $g$ are piecewise constant,
the dominated convergence theorem reduces to the following.

Theorem 5.2.6 (Dominated Convergence Theorem for Series). Let
$(a_{nj})$, $n \ge 1$, be a sequence of sequences, and let $(a_j)$ be a given sequence. Also
suppose that there is a convergent positive series $\sum_{j=1}^\infty g_j$ satisfying $|a_{nj}| \le g_j$
for all $j \ge 1$ and $n \ge 1$. If
$$\lim_{n\nearrow\infty} a_{nj} = a_j, \qquad j \ge 1,$$
then
$$\lim_{n\nearrow\infty} \sum_{j=1}^\infty a_{nj} = \sum_{j=1}^\infty a_j.$$
To see this, for $j - 1 < x \le j$ set $f_n(x) = a_{nj}$, $n \ge 1$, $f(x) = a_j$, and
$g(x) = g_j$, $j = 1, 2, \ldots$, and use Exercise 4.3.6.
Let us use the dominated convergence theorem for series to show³
$$\lim_{x\to1}\left(1 - \frac{1}{2^x} + \frac{1}{3^x} - \frac{1}{4^x} + \cdots\right) = 1 - \frac12 + \frac13 - \frac14 + \cdots, \tag{5.2.19}$$
which sums to $\log 2$. For this, by the mean value theorem,
$(2j-1)^{-x} - (2j)^{-x} = x(2j-t)^{-x-1}$ for some $0 < t < 1$. Hence,
$(2j-1)^{-x} - (2j)^{-x} \le 2(2j-1)^{-3/2}$ when $1/2 < x < 2$. Now let $x_n \to 1$ with $1/2 < x_n < 2$, and set
$a_{nj} = (2j-1)^{-x_n} - (2j)^{-x_n}$, $a_j = (2j-1)^{-1} - (2j)^{-1}$, $g_j = 2(2j-1)^{-3/2}$
for $j \ge 1$, $n \ge 1$. Then $a_{nj} \to a_j$ and $|a_{nj}| \le g_j$ for all $j \ge 1$. Hence, the
theorem applies, and, since the sequence $(x_n)$ is arbitrary, we obtain (5.2.19).
Note how, here, we are not choosing $a_{nj}$ as the individual terms but as pairs
of terms, producing an absolutely convergent series out of a conditionally
convergent one (cf. the Dirichlet test (§1.7)).
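
The pairing trick can be illustrated numerically. In the sketch below (illustrative names, a fixed truncation of 200000 pairs chosen only for demonstration), `paired_term` is the paired difference $(2j-1)^{-x} - (2j)^{-x}$ and `dominating_term` is the bound $2(2j-1)^{-3/2}$; the script checks the domination on a sample of exponents and watches the paired sums approach $\log 2$ as $x \to 1$.

```python
import math

def paired_term(j, x):
    """(2j-1)^(-x) - (2j)^(-x): two consecutive terms of the alternating series, combined."""
    return (2 * j - 1) ** (-x) - (2 * j) ** (-x)

def dominating_term(j):
    """g_j = 2 (2j-1)^(-3/2), independent of x in (1/2, 2)."""
    return 2.0 * (2 * j - 1) ** (-1.5)

def paired_sum(x, pairs=200000):
    """Truncated sum of the paired series at exponent x."""
    return sum(paired_term(j, x) for j in range(1, pairs + 1))

if __name__ == "__main__":
    for x in (1.1, 1.01, 1.001, 1.0):
        print(x, paired_sum(x), math.log(2.0))
```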

Just as we used the dominated convergence theorem for integrals to obtain 
continuity under the integral sign, we can use the theorem for series to obtain 
the following. 


Theorem 5.2.7 (Continuity Under the Summation Sign). Let $(f_n)$ be
a sequence of continuous functions defined on $(a,b)$, and suppose that there
is a convergent positive series $\sum_{n=1}^\infty g_n$ of numbers satisfying $|f_n(x)| \le g_n$
for $n \ge 1$ and $a < x < b$. If
$$F(x) = \sum_{n=1}^\infty f_n(x), \qquad a < x < b,$$
then $F : (a,b) \to \mathbf{R}$ is continuous.

For example,
$$\zeta(x) = \sum_{n=1}^\infty \frac{1}{n^x}, \qquad x > 1,$$
is continuous on $(a,\infty)$ for $a > 1$, since $1/n^x \le 1/n^a$ for $x \ge a$ and $\sum g_n = \sum 1/n^a$
converges. Since $a > 1$ is arbitrary, $\zeta$ is continuous on $(1,\infty)$.


³ This series converges for $x > 0$ by the Leibnitz test.

Exercises


5.2.1. Derive 


Thus, $\pi \ne 22/7$.


5.2.2. Use
$$\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots$$
to derive the Laplace transform
$$\int_0^\infty e^{-sx}\,\frac{\sin x}{x}\,dx = \arctan\left(\frac1s\right), \qquad s > 1.$$


5.2.3. Suppose that $f_n$, $n \ge 1$, $f$, and $g$ are as in the dominated convergence
theorem. Show that $f$ is integrable over $(a,b)$.


5.2.4. Show that $x^2J_0''(x) + xJ_0'(x) + x^2J_0(x) = 0$ for all $x$.
5.2.5. Derive (5.2.11) by integrating by parts.
5.2.6. Show that
$$\frac{\sin x}{x} = \cos(x/2)\cos(x/4)\cos(x/8)\cdots.$$


5.2.7. This is an example where switching the integral and the series changes
the answer. Show that
$$\sum_{n=0}^\infty \frac{(-1)^n}{n!} \int_0^\infty e^{-x}x^n\,dx \ne \int_0^\infty e^{-x} \sum_{n=0}^\infty \frac{(-1)^n}{n!}\,x^n\,dx.$$

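5.2.8. Show that the Fourier transform (compare with Exercise 5.1.8)
$$\int_0^\infty \frac{\sin(sx)}{e^x - 1}\,dx = \sum_{n=1}^\infty \frac{s}{n^2 + s^2}, \qquad -\infty < s < \infty.$$


5.2.9. Show that
$$\int_0^\infty \frac{\sinh(sx)}{e^x - 1}\,dx = \sum_{n=1}^\infty \frac{s}{n^2 - s^2}, \qquad |s| < 1.$$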


5.2.10. Let $s_n$, $n \ge 0$, be the $n$th partial sum of the Bailey-Borwein-Plouffe
series. Show
$$s_n < \pi < s_n + \frac{1}{4(n+1)^2\,16^{n+1}} = S_n, \qquad n \ge 0.$$

5.2.11. With $s_n$ and $S_n$ as in the previous exercise, $n \ge 0$, write computer
code to show
$$s_5 = 40413742330349316707/12864093722915635200,$$
$$S_5 = 62075508227595320986877/19759247958398415667200.$$
Following Exercise 1.3.18 show the continued fraction expansions of $s_5$ and
$S_5$ both start out as $3 + [7, 15, 1, 292, \ldots]$. Conclude (Exercise 1.3.23)
$$\pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cdots}}}}$$
leading to the convergents $22/7 = 3 + [7]$ and $355/113 = 3 + [7, 15, 1]$.
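
One possible starting point for the computer code requested in Exercise 5.2.11 is sketched below. It assumes that $s_n$ sums the terms $k = 0, \ldots, n$ of (5.2.5) and that $S_n = s_n + 1/(4(n+1)^2 16^{n+1})$, as in Exercise 5.2.10; the function names are illustrative. Exact rational arithmetic avoids any rounding in the continued fraction expansion.

```python
from fractions import Fraction

def bbp_partial_sum(n):
    """s_n: exact sum of the terms k = 0..n of the Bailey-Borwein-Plouffe series."""
    return sum(
        (Fraction(4, 8 * k + 1) - Fraction(2, 8 * k + 4)
         - Fraction(1, 8 * k + 5) - Fraction(1, 8 * k + 6)) / 16 ** k
        for k in range(n + 1)
    )

def upper_bound(n):
    """S_n = s_n + 1 / (4 (n+1)^2 16^(n+1))."""
    return bbp_partial_sum(n) + Fraction(1, 4 * (n + 1) ** 2 * 16 ** (n + 1))

def continued_fraction(q, terms):
    """First `terms` partial quotients of the positive rational q."""
    out = []
    for _ in range(terms):
        a = q.numerator // q.denominator
        out.append(a)
        q -= a
        if q == 0:
            break
        q = 1 / q
    return out

if __name__ == "__main__":
    s5, S5 = bbp_partial_sum(5), upper_bound(5)
    print(s5)
    print(S5)
    print(continued_fraction(s5, 5), continued_fraction(S5, 5))
```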


5.2.12. Show that the $\nu$th Bessel function
$$J_\nu(x) = \frac{1}{\pi}\int_0^\pi \cos(\nu t - x\sin t)\,dt, \qquad -\infty < x < \infty,$$
is continuous. Here $\nu$ is any real.

5.2.13. Show that $\psi(x) = \sum_{n=1}^\infty e^{-n^2\pi x}$, $x > 0$, is continuous.


5.2.14. Let $f_n, f, g : (a,b) \to \mathbf{R}$ be as in the dominated convergence theorem,
and suppose that $a_n \to a+$ and $b_n \to b-$. Suppose that we have domination
$|f_n(x)| \le g(x)$ only on $(a_n, b_n)$, $n \ge 1$. Show that
$$\lim_{n\nearrow\infty} \int_{a_n}^{b_n} f_n(x)\,dx = \int_a^b f(x)\,dx.$$
5.2.15. Show $J_0$ in the text is the same as $J_\nu$ in Exercise 5.2.12 with $\nu = 0$.


5.2.16. Use Exercise 4.4.22 to show that Euler's constant satisfies
$$\gamma = \lim_{n\nearrow\infty}\left[\int_0^1 \frac{1 - (1 - t/n)^n}{t}\,dt - \int_1^n \frac{(1 - t/n)^n}{t}\,dt\right].$$
Use the dominated convergence theorem to conclude that
$$\gamma = \int_0^1 \frac{1 - e^{-t}}{t}\,dt - \int_1^\infty \frac{e^{-t}}{t}\,dt.$$
(For the second part, first show $0 \le [1 - (1 - t/n)^n]/t \le 1$.)

5.2.17. Use Euler's continued fraction formula to derive
$$\arctan x = \cfrac{x}{1 + \cfrac{x^2}{3 - x^2 + \cfrac{(3x)^2}{5 - 3x^2 + \cfrac{(5x)^2}{7 - 5x^2 + \cdots}}}}.$$
5.2.18. Use Euler's continued fraction formula to derive
$$e^x = 1 + \cfrac{x}{1 - \cfrac{x}{2 + x - \cfrac{2x}{3 + x - \cfrac{3x}{4 + x - \cdots}}}}.$$

5.3 Gauss’ Arithmetic-Geometric Mean 


Given $a > b > 0$, their arithmetic mean is given by
$$a' = \frac{a+b}{2},$$
and their geometric mean by
$$b' = \sqrt{ab}.$$
Since
$$a' - b' = \frac{a+b}{2} - \sqrt{ab} = \frac12\left(\sqrt a - \sqrt b\right)^2 > 0, \tag{5.3.1}$$
these equations transform the pair $(a,b)$, $a > b > 0$, into a pair $(a',b')$,
$a' > b' > 0$. Gauss discovered that iterating this transformation leads to a
limit with striking properties.

To begin, since $a$ is the larger of $a$ and $b$ and $a'$ is their arithmetic
mean, $a' < a$. Similarly, since $b$ is the smaller of $a$ and $b$, $b' > b$. Thus,
$b < b' < a' < a$.

Set $a_0 = a$ and $b_0 = b$, and define the iteration
$$a_{n+1} = \frac{a_n + b_n}{2}, \tag{5.3.2}$$
$$b_{n+1} = \sqrt{a_nb_n}, \qquad n \ge 0. \tag{5.3.3}$$


By the previous paragraph, for $a > b > 0$, this gives a strictly decreasing
sequence $(a_n)$ and a strictly increasing sequence $(b_n)$ with all the $a$'s greater
than all the $b$'s. Thus, both sequences converge (Figure 5.3) to finite positive
limits $a_*$, $b^*$ with $a_* \ge b^* > 0$.

Fig. 5.3 The AGM iteration (the points $0 < b < b_1 < b_2 < M(a,b) < a_2 < a_1 < a$ on a line)


Letting $n \nearrow \infty$ in (5.3.2), we see that $a_* = (a_* + b^*)/2$ which yields $a_* = b^*$.
Thus, both sequences converge to a common limit, the arithmetic-geometric
mean (AGM) of $(a,b)$, which we denote
$$M(a,b) = \lim_{n\nearrow\infty} a_n = \lim_{n\nearrow\infty} b_n.$$
If $(a_n')$, $(b_n')$ are the sequences associated with $a' = ta$ and $b' = tb$, then
from (5.3.2) and (5.3.3), $a_n' = ta_n$, and $b_n' = tb_n$, $n \ge 1$, $t > 0$. This implies
that $M$ is homogeneous in $(a,b)$,
$$M(ta, tb) = t \cdot M(a,b), \qquad t > 0.$$


The convergence of the sequences $(a_n)$, $(b_n)$ to the real $M(a,b)$ is quadratic
in the following sense. The differences $a_n - M(a,b)$ and $M(a,b) - b_n$ are no
more than $2c_{n+1}$, where
$$c_{n+1} = \frac{a_n - b_n}{2}, \qquad n \ge 0. \tag{5.3.4}$$
By (5.3.1),
$$0 < c_{n+1} = \frac14\left(\sqrt{a_{n-1}} - \sqrt{b_{n-1}}\right)^2 = \frac{c_n^2}{\left(\sqrt{a_{n-1}} + \sqrt{b_{n-1}}\right)^2} \le \frac{c_n^2}{4b}, \qquad n \ge 1.$$
Iterating the last inequality yields
$$0 < a_n - b_n \le 8b\left(\frac{a-b}{8b}\right)^{2^n}, \qquad n \ge 0. \tag{5.3.5}$$
This shows that each additional iteration roughly doubles the number-of-decimal-place
agreement, at least if $(a-b)/8b < 1$. For a general pair
$(a,b)$, eventually, $(a_n - b_n)/8b_n < 1$. After this point, we have the rapid
convergence (5.3.5). In (5.3.15) below, we improve (5.3.5) from an inequality
to an asymptotic equality. In Exercise 5.7.5, we further improve this to an
actual equality.

For future reference, note that
$$a_{n+1}^2 = b_{n+1}^2 + c_{n+1}^2, \qquad n \ge 0.$$
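
The iteration (5.3.2), (5.3.3) takes only a few lines of code. The sketch below (function names are illustrative, and a fixed number of steps is used instead of a stopping criterion) prints the gap $a_n - b_n$ so the doubling of correct digits can be seen directly; in double precision the gap reaches rounding level after four or five steps for the sample pair.

```python
import math

def agm_steps(a, b, steps):
    """Return [(a_0, b_0), (a_1, b_1), ..., (a_steps, b_steps)] for (5.3.2)-(5.3.3)."""
    pairs = [(a, b)]
    for _ in range(steps):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
        pairs.append((a, b))
    return pairs

def agm(a, b, steps=40):
    """Approximate M(a, b) by running the iteration for a fixed number of steps."""
    return agm_steps(a, b, steps)[-1][0]

if __name__ == "__main__":
    for n, (an, bn) in enumerate(agm_steps(1.0, 0.5, 5)):
        print(n, an, bn, an - bn)
    # Homogeneity: M(ta, tb) = t M(a, b)
    print(agm(3.0, 1.5), 3.0 * agm(1.0, 0.5))
```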
The following remarkable formula is due to Gauss.

Theorem 5.3.1. For $a > 0$ and $b > 0$,
$$\frac{1}{M(a,b)} = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}.$$

Gauss was initially guided to this formula by noting both sides agreed
to eleven decimal places when $(a,b) = (1, 1/\sqrt2)$. We compute $M(1, 1/\sqrt2)$
explicitly in the next section (see (5.4.5)).
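
In the spirit of Gauss's numerical observation, the sketch below compares $1/M(a,b)$ with a midpoint-rule approximation of the integral. The function names, the number of AGM steps, and the number of quadrature nodes are illustrative choices; the midpoint rule is used because the integrand is smooth and periodic, which makes the rule very accurate here.

```python
import math

def agm(a, b, steps=40):
    """Approximate M(a, b) by iterating (5.3.2)-(5.3.3) a fixed number of times."""
    for _ in range(steps):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def gauss_integral(a, b, nodes=2000):
    """Midpoint rule for (2/pi) * int_0^{pi/2} dtheta / sqrt(a^2 cos^2 + b^2 sin^2)."""
    h = (math.pi / 2) / nodes
    total = 0.0
    for k in range(nodes):
        t = (k + 0.5) * h
        total += 1.0 / math.sqrt((a * math.cos(t)) ** 2 + (b * math.sin(t)) ** 2)
    return (2.0 / math.pi) * total * h

if __name__ == "__main__":
    a, b = 1.0, 1.0 / math.sqrt(2.0)
    print(1.0 / agm(a, b), gauss_integral(a, b))
```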

The derivation is best understood within the context of complex numbers.⁴
We present this proof cosmetically altered to remain within the real domain.

To derive the formula, let
$$I(a,b) = \frac{1}{\pi}\int_0^\pi \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}} = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}.$$
Note $I(a,b) = I(b,a)$ and $I(m,m) = 1/m$. The main step is to establish
invariance of $I$ under the AGM iteration
$$I(a,b) = I\left(\frac{a+b}{2}, \sqrt{ab}\right) = I(a',b'). \tag{5.3.6}$$
The result follows from (5.3.6) by iteration $I(a_n,b_n) = I(a,b)$ followed by
passing to the limit $n \to \infty$ (Exercise 5.3.2). Since $(a_n,b_n) \to (m,m)$,
$m = M(a,b)$, the result follows.

The invariance (5.3.6) is established by the substitution
$$\theta' = \arccos\left(\frac{a\cos^2\theta - b\sin^2\theta}{a\cos^2\theta + b\sin^2\theta}\right).$$
This map $\theta \mapsto \theta'$ is smooth on $(0,\pi/2)$ and is a continuous bijection of $[0,\pi/2]$
onto $[0,\pi]$ (Exercise 5.3.3).

To compute the derivative $d\theta'/d\theta$, we use the fact (§3.6) that $(x,y)$ is on
the unit circle iff $(x,y) = (\cos\theta,\sin\theta)$ and define
$$x' = \frac{ax^2 - by^2}{ax^2 + by^2}, \qquad y' = \frac{2\sqrt{ab}\,xy}{ax^2 + by^2}.$$
Then $(x,y) = (\cos\theta,\sin\theta)$ on $[0,\pi/2]$ iff $(x',y') = (\cos\theta',\sin\theta')$ on $[0,\pi]$.

⁴ Via the unit circle map $x' + iy' = (\sqrt a\,x + i\sqrt b\,y)/(\sqrt a\,x - i\sqrt b\,y)$.

Now let $\lambda = 1/(ax^2 + by^2)$. Then
$$x\,dy - y\,dx = \cos\theta\cos\theta\,d\theta - \sin\theta(-\sin\theta)\,d\theta = (\cos^2\theta + \sin^2\theta)\,d\theta = d\theta$$
and $(x',y') = (\lambda(ax^2 - by^2), 2\lambda b'xy)$ so
$$dx' = x'\,\frac{d\lambda}{\lambda} + \lambda\,(2ax\,dx - 2by\,dy),$$
$$dy' = y'\,\frac{d\lambda}{\lambda} + 2\lambda b'\,(x\,dy + y\,dx);$$
hence,
$$d\theta' = x'\,dy' - y'\,dx' = 2b'\lambda^2(ax^2 + by^2)(x\,dy - y\,dx) = 2b'\lambda\,d\theta.$$
Now
$$\begin{aligned}
b'^2x'^2 + a'^2y'^2 &= \lambda^2b'^2(ax^2 - by^2)^2 + \lambda^2(a+b)^2b'^2x^2y^2 \\
&= \lambda^2b'^2\left(a^2x^2y^2 + b^2x^2y^2 + b^2y^4 + a^2x^4\right) \\
&= \lambda^2b'^2(x^2 + y^2)(a^2x^2 + b^2y^2).
\end{aligned}$$
Dividing the last two equations yields
$$\frac{d\theta'}{\sqrt{b'^2\cos^2\theta' + a'^2\sin^2\theta'}} = \frac{2\,d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}$$
which implies
$$\int_0^\pi \frac{d\theta'}{\sqrt{b'^2\cos^2\theta' + a'^2\sin^2\theta'}} = 2\int_0^{\pi/2} \frac{d\theta}{\sqrt{a^2\cos^2\theta + b^2\sin^2\theta}}.$$
Thus, $I(a',b') = I(b',a') = I(a,b)$.

Next, we look at the behavior of $M(1,x)$, as $x \to 0+$. When $a_0 = 1$ and
$b_0 = 0$, the arithmetic-geometric iteration yields $a_n = 2^{-n}$ and $b_n = 0$ for
all $n \ge 1$. Hence, $M(1,0) = 0$. This leads us to believe that $M(1,x) \to 0$, as
$x \to 0+$, or, what is the same, $1/M(1,x) \to \infty$, as $x \to 0+$. Exactly at what
speed this happens leads us to another formula for $\pi$.

Theorem 5.3.2.
$$\lim_{x\to0+}\left[\frac{\pi/2}{M(1,x)} - \log\left(\frac4x\right)\right] = 0. \tag{5.3.7}$$

To derive this, from Exercise 5.3.4,
$$\frac{\pi/2}{M(1,x)} = \int_0^\infty \frac{dt}{\sqrt{(1+t^2)(x^2+t^2)}}. \tag{5.3.8}$$
By Exercise 5.3.7, this equals
$$\frac{\pi/2}{M(1,x)} = 2\int_0^{1/\sqrt x} \frac{dt}{\sqrt{(1+t^2)(1+(xt)^2)}}. \tag{5.3.9}$$
Now call the right side of (5.3.9) $I(x)$. Thus, the result will follow if we show
that
$$\lim_{x\to0+}\left[I(x) - \log\left(\frac4x\right)\right] = 0. \tag{5.3.10}$$
To derive (5.3.10), note that
$$\begin{aligned}
J(x) = 2\int_0^{1/\sqrt x} \frac{dt}{\sqrt{1+t^2}} &= 2\log\left(t + \sqrt{1+t^2}\right)\Big|_0^{1/\sqrt x} \\
&= 2\log\left(1 + \sqrt{x+1}\right) + \log\left(\frac1x\right) \\
&= 2\log\left(\frac12 + \frac12\sqrt{x+1}\right) + \log\left(\frac4x\right)
\end{aligned}$$
and, so $\log(4/x) - J(x) \to 0$ as $x \to 0+$. Thus, it is enough to show that
$$\lim_{x\to0+}\,[I(x) - J(x)] = 0. \tag{5.3.11}$$
But, for $xt > 0$,
$$0 \le 1 - \frac{1}{\sqrt{1+(xt)^2}} \le 1 - \frac{1}{1+xt} = \frac{xt}{1+xt} \le xt.$$
So
$$|I(x) - J(x)| \le 2\int_0^{1/\sqrt x} \frac{xt}{\sqrt{1+t^2}}\,dt = 2x\sqrt{1+t^2}\,\Big|_0^{1/\sqrt x} = 2\sqrt{x(1+x)} - 2x,$$
which clearly goes to zero as $x \to 0+$.
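
The limit (5.3.7) can be observed numerically. The sketch below (illustrative names; the AGM is computed with a fixed number of steps) prints the gap $\frac{\pi/2}{M(1,x)} - \log(4/x)$ for a few small values of $x$.

```python
import math

def agm(a, b, steps=60):
    """Approximate M(a, b) by iterating (5.3.2)-(5.3.3) a fixed number of times."""
    for _ in range(steps):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def gap(x):
    """(pi/2)/M(1, x) - log(4/x), which tends to 0 as x -> 0+ by (5.3.7)."""
    return (math.pi / 2) / agm(1.0, x) - math.log(4.0 / x)

if __name__ == "__main__":
    for x in (1e-1, 1e-2, 1e-3, 1e-4):
        print(x, gap(x))
```

In these samples the gap shrinks much faster than the bound $2\sqrt{x(1+x)} - 2x$ from the derivation, which is only meant to show convergence.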
Our next topic is the functional equation. First and foremost, since the
AGM limit starting from $(a_0,b_0)$ is the same as that starting from $(a_1,b_1)$,
$$M(a,b) = M\left(\frac{a+b}{2}, \sqrt{ab}\right).$$
Below, given $0 < x < 1$, we let $x' = \sqrt{1 - x^2}$ be the complementary variable.
For example, $(x')' = x$ and $k = 2\sqrt x/(1+x)$ implies $k' = (1-x)/(1+x)$
since
$$\left(\frac{2\sqrt x}{1+x}\right)^2 + \left(\frac{1-x}{1+x}\right)^2 = 1.$$
Also with $a_n$, $b_n$, $c_n$, $n \ge 1$, as above, $(b_n/a_n)' = c_n/a_n$. The functional
equation we are after is best expressed in terms of the function
$$Q(x) = \frac{M(1,x)}{M(1,x')}, \qquad 0 < x < 1.$$
Note that $Q(x') = 1/Q(x)$.


Theorem 5.3.3 (AGM Functional Equation).
$$Q(x) = 2Q\left(\frac{1-x'}{1+x'}\right), \qquad 0 < x < 1. \tag{5.3.12}$$
To see this, note that $M(1+x', 1-x') = M(1,x)$. So
$$M(1,x) = M(1+x', 1-x') = (1+x')\,M\left(1, \frac{1-x'}{1+x'}\right). \tag{5.3.13}$$
Here we used homogeneity of $M$. On the other hand,
$$M(1,x') = M\left(\frac{1+x'}{2}, \sqrt{x'}\right) = \frac{1+x'}{2}\,M\left(1, \frac{2\sqrt{x'}}{1+x'}\right) = \frac{1+x'}{2}\,M\left(1, \left(\frac{1-x'}{1+x'}\right)'\right). \tag{5.3.14}$$
Here again, we used homogeneity of $M$. Dividing (5.3.13) by (5.3.14), the
result follows.

If $(a_n)$ and $(b_n)$ are positive sequences, we say that $(a_n)$ and $(b_n)$ are
asymptotically equal, and we write $a_n \sim b_n$, as $n \nearrow \infty$, if $a_n/b_n \to 1$, as $n \nearrow \infty$.
Note that $(a_n)$ and $(b_n)$ are asymptotically equal iff $\log(a_n) - \log(b_n) \to 0$.
Now we combine the last two results to obtain the following improvement
of (5.3.5).


Theorem 5.3.4. Let $a > b > 0$, and let $a_n$, $b_n$, $n \ge 1$, be as in (5.3.2), (5.3.3).
Then
$$a_n - b_n \sim 8M(a,b)\cdot q^{2^n}, \qquad n \nearrow \infty, \tag{5.3.15}$$
where $q = e^{-\pi Q(b/a)}$.

To derive this, use (5.3.4) and $a_n \to M(a,b)$ to check that (5.3.15) is
equivalent to
$$\frac{c_n}{4a_n} \sim \left(e^{-(\pi/2)Q(b/a)}\right)^{2^n}, \qquad n \nearrow \infty. \tag{5.3.16}$$
Now let $x_n = c_n/a_n$. Then $x_n \to 0$, as $n \nearrow \infty$. By taking the log of (5.3.16),
it is enough to show that
$$\log\left(\frac{4}{x_n}\right) - \frac{\pi}{2}\,2^nQ\left(\frac ba\right) \to 0, \qquad n \nearrow \infty. \tag{5.3.17}$$
By (5.3.7), (5.3.17) is implied by
$$\frac{\pi/2}{M(1,x_n)} - \frac{\pi}{2}\,2^nQ\left(\frac ba\right) \to 0, \qquad n \nearrow \infty. \tag{5.3.18}$$
By Exercise 5.3.9, (5.3.18) is implied by
$$\frac{1}{Q(x_n)} - 2^nQ\left(\frac ba\right) \to 0, \qquad n \nearrow \infty. \tag{5.3.19}$$
In fact, we will show that the left side of (5.3.19) is zero for all $n \ge 1$. To
this end, since $c_n/a_n = (b_n/a_n)'$,
$$\frac{c_{n+1}}{a_{n+1}} = \frac{a_n - b_n}{a_n + b_n} = \frac{1 - (b_n/a_n)}{1 + (b_n/a_n)} = \frac{1 - (c_n/a_n)'}{1 + (c_n/a_n)'}.$$
Hence, by the functional equation,
$$Q(c_{n+1}/a_{n+1}) = Q\left(\frac{1 - (c_n/a_n)'}{1 + (c_n/a_n)'}\right) = \frac12\,Q(c_n/a_n).$$
Iterating this down to $n = 1$, we obtain
$$\begin{aligned}
Q(c_n/a_n) &= 2^{-1}Q(c_{n-1}/a_{n-1}) = 2^{-2}Q(c_{n-2}/a_{n-2}) = \cdots \\
&= 2^{-(n-1)}Q(c_1/a_1) = 2^{-n}Q((b/a)') = 2^{-n}/Q(b/a), \qquad n \ge 1.
\end{aligned}$$
This shows that $1/Q(x_n) = 2^nQ(b/a)$.
Dividing by $2^n$ in (5.3.17), we obtain
$$\lim_{n\nearrow\infty} 2^{-n}\log\left(\frac{a_n}{c_n}\right) = \frac{\pi}{2}\,Q\left(\frac ba\right), \tag{5.3.20}$$
which we will need in §5.7. Note that we have discarded the 4 since $2^{-n}\log 4 \to 0$.
In the exercises below, the AGM is generalized from two variables $(a,b)$ to
$d$ variables $(x_1, \ldots, x_d)$.
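
The asymptotic equality (5.3.15) can be checked in floating point as long as $a_n - b_n$ stays well above rounding error. The sketch below uses illustrative names and a starting pair with $b/a$ small, which slows the iteration enough to leave several usable steps, and compares $a_n - b_n$ with $8M(a,b)q^{2^n}$, $q = e^{-\pi Q(b/a)}$.

```python
import math

def agm_steps(a, b, steps):
    """Return [(a_0, b_0), ..., (a_steps, b_steps)] for the iteration (5.3.2)-(5.3.3)."""
    pairs = [(a, b)]
    for _ in range(steps):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
        pairs.append((a, b))
    return pairs

def agm(a, b, steps=60):
    return agm_steps(a, b, steps)[-1][0]

def Q(x):
    """Q(x) = M(1, x) / M(1, x') with x' = sqrt(1 - x^2), for 0 < x < 1."""
    return agm(1.0, x) / agm(1.0, math.sqrt(1.0 - x * x))

def predicted_gap(a, b, n):
    """Right side of (5.3.15): 8 M(a, b) q^(2^n) with q = exp(-pi Q(b/a))."""
    q = math.exp(-math.pi * Q(b / a))
    return 8.0 * agm(a, b) * q ** (2 ** n)

if __name__ == "__main__":
    a, b = 1.0, 1e-3
    pairs = agm_steps(a, b, 5)
    for n in range(1, 6):
        actual = pairs[n][0] - pairs[n][1]
        print(n, actual, predicted_gap(a, b, n), actual / predicted_gap(a, b, n))
```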
Exercises


5.3.1. Fix $0 < t < 1$. For $0 < b < a$, let
$$a' = (1-t)a + tb, \qquad b' = a^{1-t}b^t.$$
Define an iteration by $a_{n+1} = a_n'$, $b_{n+1} = b_n'$. Show that $(a_n)$, $(b_n)$ converge
to a common limit.


5.3.2. Use the dominated convergence theorem to show that $a_n \to a$ and
$b_n \to b$, $a > b > 0$, implies $I(a_n,b_n) \to I(a,b)$.


5.3.3. Show that the map $\theta' = G(\theta)$ is a continuous bijection from $[0,\pi/2]$
to $[0,\pi]$.


5.3.4. Use $t = b\tan\theta$ to show
$$\frac{1}{M(a,b)} = \frac{2}{\pi}\int_0^\infty \frac{dt}{\sqrt{(a^2+t^2)(b^2+t^2)}}.$$


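5.3.5. Show that
$$\frac{1}{M(1+x, 1-x)} = \frac{2}{\pi}\int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - x^2\sin^2\theta}}, \qquad 0 < x < 1.$$


5.3.6. Show that
$$\frac{1}{M(1+x, 1-x)} = \sum_{n=0}^\infty \frac{1}{16^n}\binom{2n}{n}^2x^{2n}, \qquad 0 < x < 1,$$
by using the binomial theorem to expand the square root and integrating
term by term.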


5.3.7. Show that
$$\int_0^\infty \frac{dt}{\sqrt{(1+t^2)(x^2+t^2)}} = 2\int_0^{1/\sqrt x} \frac{dt}{\sqrt{(1+t^2)(1+(xt)^2)}}.$$
(Break (5.3.8) into $\int_0^{\sqrt x} + \int_{\sqrt x}^\infty$ and substitute $t = x/s$ in the second piece.)


5.3.8. With $x' = \sqrt{1 - x^2}$, show that $M(1+x, 1-x) = M((1+x')/2, \sqrt{x'})$.


5.3.9. Show that
$$\left|\frac{1}{M(1,x)} - \frac{1}{Q(x)}\right| \le x, \qquad 0 < x < 1.$$

5.3.10. Show that
$$Q(x) = \frac12\,Q\left(\frac{2\sqrt x}{1+x}\right), \qquad 0 < x < 1.$$


5.3.11. Show that $M(1,\cdot) : (0,1) \to (0,1)$ and $Q : (0,1) \to (0,\infty)$ are strictly
increasing, continuous bijections.


5.3.12. Show that for each $a > 1$, there exists a unique $1 > b = f(a) > 0$,
such that $M(a,b) = 1$.


5.3.13. With $f$ as in the previous Exercise, use (5.3.7) to show
$$f(a) \sim 4ae^{-\pi a/2}, \qquad a \to \infty.$$
(Let $x = b/a = f(a)/a$ and take logs of both sides.)


5.3.14. Given reals $a_1, \ldots, a_d$, let $p_1, \ldots, p_d$ be given by
$$(x + a_1)(x + a_2)\cdots(x + a_d) = x^d + \binom d1p_1x^{d-1} + \cdots + \binom{d}{d-1}p_{d-1}x + p_d.$$
Then $p_1, \ldots, p_d$ are polynomials in $a_1, \ldots, a_d$, the so-called elementary
symmetric polynomials.⁵ Show that
$$p_k(1, 1, \ldots, 1) = 1, \qquad 1 \le k \le d,$$
$p_1$ is the arithmetic mean $(a_1 + \cdots + a_d)/d$ and $p_d$ is the product $a_1a_2\cdots a_d$. For
$a_1, \ldots, a_d$ positive, conclude (Exercise 3.3.28) the arithmetic and geometric
mean inequality
$$\frac{a_1 + \cdots + a_d}{d} \ge (a_1a_2\cdots a_d)^{1/d},$$
with equality iff all the $a_j$'s are equal.
5.3.15. Given $a_1 \ge a_2 \ge \cdots \ge a_d > 0$, let $a_1' = p_1(a_1, \ldots, a_d)$ be their
arithmetic mean and let $a_d' = p_d(a_1, \ldots, a_d)^{1/d}$ be their geometric mean. Use
Exercise 3.2.10 to show


5.3.16. Given a, > a2 >--- > aq > 0, let 


1/2 1/d 
(Gis Yo veneg lla) = Gla Oeirencia) = (pi, ps! ye pit ), 
(a), a5, — = sgt ) = G?(a1, a2, => ., ad) = G(a}, a9, = sy Geg)y 


5 More accurately, the normalized elementary symmetric polynomials 
and so on. This defines a sequence
\[ (a_1^{(n)}, a_2^{(n)}, \dots, a_d^{(n)}) = G^n(a_1, a_2, \dots, a_d), \qquad n \ge 0. \]


Show that $(a_1^{(n)})$ is decreasing, $(a_d^{(n)})$ is increasing, and


(n) = 2n 
a(S)" ne 
ay d ad 


Conclude that there is a positive real m such that
\[ a_j^{(n)} \to m, \qquad n \to \infty,\ 1 \le j \le d, \]
(Exercise 3.3.28 and Exercise 5.3.15). If we set $m = M(a_1,\dots,a_d)$, show
that
\[ M(a_1, a_2, \dots, a_d) = M(p_1, p_2^{1/2}, \dots, p_d^{1/d}). \]


An integral formula for M(a1,a2,...,@a) analogous to that of Theorem 5.3.1 
is not known. 


5.4 The Gaussian Integral 


In this section, we derive the Gaussian integral
\[ \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}. \qquad (5.4.1) \]


This formula is remarkable because the primitive of $e^{-x^2/2}$ cannot be
expressed in terms of the elementary functions (i.e., the functions studied
in Chapter 3). Nevertheless the area (Figure 5.4) of the (total) subgraph of
$e^{-x^2/2}$ is explicitly computable. Because of (5.4.1), the Gaussian function
$g(x) = e^{-x^2/2}/\sqrt{2\pi}$ has total area under its graph equal to 1.


Fig. 5.4 The Gaussian function 


The usual derivation of (5.4.1) involves changing variables from Cartesian 
coordinates (x, y) to polar coordinates in a double integral. How to do this is 
a two-variable result. Here we give an elementary derivation that uses only
the one-variable material we have studied so far. To derive (5.4.1), we will, 
however, need to know how to “differentiate under an integral sign.” 

To explain this, consider the integral

\[ F(x) = \int_c^d f(x,t)\,dt, \qquad a < x < b, \qquad (5.4.2) \]

where $f(x,t) = 3(2x+t)^2$ and a < b, c < d are reals. We wish to differentiate
F. There are two ways we can do this. The first method is to evaluate the
integral, obtaining $F(x) = (2x+d)^3 - (2x+c)^3$, and then to differentiate to
get $F'(x) = 6(2x+d)^2 - 6(2x+c)^2$. The second method is to differentiate
the integrand $f(x,t) = 3(2x+t)^2$ with respect to x, obtaining $12(2x+t)$, and
then to evaluate the integral $\int_c^d 12(2x+t)\,dt$, obtaining $6(2x+d)^2 - 6(2x+c)^2$.
Since both methods yield the same result, for $f(x,t) = 3(2x+t)^2$, we conclude
that
\[ F'(x) = \int_c^d \frac{\partial f}{\partial x}(x,t)\,dt, \qquad a < x < b, \qquad (5.4.3) \]

where the partial derivative $\partial f/\partial x(x,t)$ is the derivative with respect to x,

\[ \frac{\partial f}{\partial x}(x,t) = \lim_{x' \to x} \frac{f(x',t) - f(x,t)}{x' - x}, \qquad a < x < b. \]

It turns out that (5.4.2) implies (5.4.3) in a wide variety of cases.
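
The two methods can also be compared numerically. The following is a minimal sketch, not part of the original text: the limits c = 0, d = 1, the point x = 0.7, and the helper names are assumptions chosen only for illustration. It approximates F′(x) by a central difference (first method) and compares it with the integral of the partial derivative (second method).

```python
def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

C, D = 0.0, 1.0  # assumed limits c < d, chosen only for illustration

def F(x):
    # F(x) = integral over (c, d) of f(x, t) = 3(2x + t)^2, as in (5.4.2)
    return simpson(lambda t: 3 * (2 * x + t) ** 2, C, D)

def integral_of_partial(x):
    # right side of (5.4.3): integral over (c, d) of df/dx = 12(2x + t)
    return simpson(lambda t: 12 * (2 * x + t), C, D)

def central_difference(func, x, h=1e-5):
    # numerical derivative of func at x
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.7
first_method = central_difference(F, x0)
second_method = integral_of_partial(x0)
closed_form = 6 * (2 * x0 + D) ** 2 - 6 * (2 * x0 + C) ** 2
print(first_method, second_method, closed_form)
```

All three printed numbers should agree up to rounding, which is what (5.4.3) asserts for this particular f.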


Theorem 5.4.1 (Differentiation Under the Integral Sign). Let f :
(a,b) × (c,d) → R be a function of two variables (x,t), such that

\[ \frac{\partial f}{\partial x}(x,t), \qquad a < x < b,\ c < t < d, \]

exists. Suppose there is an integrable function g : (c,d) → R, such that
\[ |f(x,t)| + \left|\frac{\partial f}{\partial x}(x,t)\right| \le g(t), \qquad a < x < b,\ c < t < d. \]

Then f(x,·) and ∂f/∂x(x,·) are integrable for all a < x < b. If g, f(x,·),
a < x < b, and ∂f/∂x(x,·), a < x < b, are continuous, then F : (a,b) → R
given by (5.4.2) is differentiable on (a,b) and (5.4.3) holds.

Note that the domination hypothesis guarantees that F(x) and the right 
side of (5.4.3) are well defined. Let us apply the theorem right away to ob- 
tain (5.4.1). 

To this end, let $I = \int_0^\infty e^{-s^2/2}\,ds$ be half the integral in (5.4.1). Since
$(s-1)^2 \ge 0$, $-s^2/2 \le (1/2) - s$. Hence, $I \le \int_0^\infty e^{(1/2)-s}\,ds = \sqrt{e}$. Thus, I is
finite and

\[ I^2 = I\int_0^\infty e^{-t^2/2}\,dt = \int_0^\infty e^{-t^2/2}\,I\,dt. \qquad (5.4.4) \]

Now set
\[ f(\theta,t) = e^{-t^2/2}\int_0^{t\cdot\tan\theta} e^{-s^2/2}\,ds, \qquad 0 < t < \infty,\ 0 < \theta < \pi/2. \]

Since tan(π/2−) = ∞, by continuity at the endpoints,
\[ f(\pi/2-,t) = e^{-t^2/2}\,I, \qquad t > 0. \]
Now let
\[ F(\theta) = \int_0^\infty f(\theta,t)\,dt, \qquad 0 < \theta < \pi/2. \]
Since $f(\theta,t) \le Ie^{-t^2/2}$ and $g(t) = Ie^{-t^2/2}$ is integrable by (5.4.4), by the
dominated convergence theorem, we obtain

\[ F(\pi/2-) = \lim_{\theta\to\pi/2-}\int_0^\infty f(\theta,t)\,dt = \int_0^\infty f(\pi/2-,t)\,dt = I^2. \]

Thus, to evaluate I², we need to compute F(θ). Although F(θ) is not directly
computable from its definition, it turns out that F′(θ) is, using differentiation
under the integral sign (Figure 5.5).


Fig. 5.5 The region of integration defining F(θ)


To motivate where the formula for F comes from, note that the formula for
I² can be thought of as a double integral over the first quadrant 0 < s < ∞,
0 < t < ∞, in the st-plane, and the formula for F(θ) can be thought of as
a double integral over the triangular sector 0 < s < t·tan θ, 0 < t < ∞, in
the st-plane. As the angle θ opens up to π/2, the triangular sector fills the
quadrant. Of course, we do not actually use double integrals in the derivation
of (5.4.1).

Now by the fundamental theorem and the chain rule,

\[ \frac{\partial f}{\partial\theta}(\theta,t) = e^{-t^2/2}e^{-t^2(\tan^2\theta)/2}\,t\sec^2\theta = e^{-t^2\sec^2\theta/2}\,t\sec^2\theta. \]

We verify the hypotheses of the theorem on (0,b) × (0,∞), where 0 < b < π/2
is fixed. Note, first, that f(θ,t) and ∂f/∂θ are continuous in (θ,t). Moreover,

\[ 0 \le f(\theta,t) \le Ie^{-t^2/2}, \qquad 0 \le \frac{\partial f}{\partial\theta}(\theta,t) \le e^{-t^2/2}\,t\sec^2 b \]

(sec²θ ≥ 1 is increasing on (0,π/2)). So we may take $g(t) = e^{-t^2/2}(I + t\sec^2 b)$,
which is integrable.⁶ This verifies all the hypotheses. Applying the
theorem yields

\[ F'(\theta) = \int_0^\infty e^{-t^2\sec^2\theta/2}\,t\sec^2\theta\,dt = \int_0^\infty e^{-u}\,du = 1, \qquad 0 < \theta < b. \]

Here we used the substitution u = t² sec²θ/2, du = t sec²θ dt. Since 0 < b <
π/2 is arbitrary, F′(θ) = 1 is valid on (0, π/2).

Thus, F(θ) = θ + constant on (0,π/2). To evaluate the constant, note that
f(0+,t) = 0 for all 0 < t < ∞, by continuity at the endpoints. Then since
$f(\theta,t) \le Ie^{-t^2/2}$, we can apply the dominated convergence theorem to get

\[ F(0+) = \lim_{\theta\to 0+}\int_0^\infty f(\theta,t)\,dt = \int_0^\infty f(0+,t)\,dt = 0. \]

This shows that F(θ) = θ, so F(π/2−) = π/2. Hence, I² = π/2. Since I is
half the integral in (5.4.1), this derives (5.4.1).
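
The conclusion F(θ) = θ can be checked numerically. The sketch below is an illustration only, not part of the derivation: it truncates the infinite ranges at an assumed cutoff T_MAX = 10 (where e^{-t²/2} is negligible) and evaluates the nested integrals with a composite Simpson rule; the helper names are hypothetical.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

T_MAX = 10.0  # assumed truncation point for the infinite ranges

def f(theta, t):
    # f(theta, t) = e^{-t^2/2} * integral of e^{-s^2/2} over 0 < s < t*tan(theta)
    upper = min(t * math.tan(theta), T_MAX)
    inner = simpson(lambda s: math.exp(-s * s / 2), 0.0, upper, 200)
    return math.exp(-t * t / 2) * inner

def F(theta):
    # F(theta) = integral of f(theta, t) over 0 < t < infinity (truncated)
    return simpson(lambda t: f(theta, t), 0.0, T_MAX, 400)

for theta in (0.3, 0.8, 1.2):
    print(theta, F(theta))

# I is half the Gaussian integral; I^2 should be close to pi/2
I = simpson(lambda s: math.exp(-s * s / 2), 0.0, T_MAX, 400)
print(I * I, math.pi / 2)
```

Each printed pair should nearly coincide, reflecting F(θ) = θ and I² = π/2.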
Let us apply the theorem to the gamma function

\[ \Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt, \qquad x > 0. \]

To this end, fix 0 < a < b < ∞. We show that Γ is differentiable on (a,b).
With $f(x,t) = e^{-t}t^{x-1}$,

\[ \frac{\partial f}{\partial x}(x,t) = e^{-t}t^{x-1}\log t, \qquad 0 < t < \infty,\ 0 < x < \infty. \]

Then f and ∂f/∂x are continuous on (a,b) × (0,∞). Since |f| + |∂f/∂x| ≤ g(t)
on (a,b) × (0,∞), where
\[ g(t) = \begin{cases} e^{-t}t^{a-1}(|\log t| + 1), & 0 < t \le 1, \\ e^{-t}t^{b-1}(|\log t| + 1), & 1 \le t, \end{cases} \]
and g is continuous and integrable over (0,∞) (Exercise 5.1.11), the
domination hypothesis of the theorem is verified. Thus, we can apply the theorem
to obtain

\[ \Gamma'(x) = \int_0^\infty e^{-t}t^{x-1}\log t\,dt, \qquad a < x < b. \]
0 


6 $\int_0^\infty g(t)\,dt = I^2 + \sec^2 b$.
Since 0 < a < b are arbitrary, this shows that Γ is differentiable on (0,∞).
Since this argument can be repeated,

\[ \Gamma''(x) = \int_0^\infty e^{-t}t^{x-1}(\log t)^2\,dt, \qquad x > 0. \]

Since this last quantity is positive, we see that Γ is strictly convex on (0,∞)
(§3.3). Differentiating repeatedly, we obtain $\Gamma^{(n)}(x)$ for all n ≥ 1. Hence, the
gamma function is smooth on (0,∞).

In Exercise 5.2.2, the Laplace transform

\[ F(x) = \int_0^\infty e^{-xt}\,\frac{\sin t}{t}\,dt = \arctan\left(\frac{1}{x}\right) \]

is computed for x > 1 by expanding sin t/t in a series. Now we compute F(x)
for x > 0 by using differentiation under the integral sign. In Exercise 5.4.12,
we need to know this for x > 0; x > 1 is not enough. Note that, to compute
F(x) for x > 0, it is enough to compute F(x) for x > a, where a > 0 is
arbitrarily small.

First, $\partial f/\partial x = -e^{-xt}\sin t$, so f and ∂f/∂x are continuous on (a,∞) ×
(0,∞). Since sin t and sin t/t are bounded by 1, |f(x,t)| + |∂f/∂x| is
dominated by $2e^{-at}$ on (a,∞) × (0,∞). Applying the theorem and Exercise 4.4.9
yields

\[ F'(x) = -\int_0^\infty e^{-xt}\sin t\,dt = -\frac{1}{1+x^2}, \qquad x > a. \]

Now by the dominated convergence theorem, $F(\infty) = \lim_{x\to\infty} F(x) = 0$. So
\[ F(x) = F(x) - F(\infty) = -\int_x^\infty F'(t)\,dt
   = \arctan t\,\Big|_x^\infty = \pi/2 - \arctan x = \arctan\left(\frac{1}{x}\right), \qquad x > a. \]

Since a > 0 is arbitrarily small, this is what we wanted to show.
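
As a sanity check on this formula, one can compare a numerical value of the Laplace transform with arctan(1/x). This is a rough sketch under stated assumptions: the infinite range is truncated at an assumed t_max = 80, Simpson's rule is used for the quadrature, and the function names are hypothetical.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def laplace_of_sinc(x, t_max=80.0, n=16000):
    # numerical value of the integral of e^{-xt} sin(t)/t over (0, t_max)
    def integrand(t):
        sinc = 1.0 if t == 0.0 else math.sin(t) / t  # sin(t)/t -> 1 as t -> 0
        return math.exp(-x * t) * sinc
    return simpson(integrand, 0.0, t_max, n)

for x in (0.5, 1.0, 2.0):
    print(x, laplace_of_sinc(x), math.atan(1 / x))
```

The values 0.5 and 1.0 lie outside the range x > 1 covered by the series argument of Exercise 5.2.2, which is the range the present derivation adds.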
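Now we derive the theorem. To this end, fix a < x < b, and let $x_n \to x$,
with $x_n \ne x$ for all n ≥ 1. We have to show that

\[ \frac{F(x_n) - F(x)}{x_n - x} \to \int_c^d \frac{\partial f}{\partial x}(x,t)\,dt. \]

Let
\[ k_n(t) = \frac{f(x_n,t) - f(x,t)}{x_n - x}, \qquad c < t < d,\ n \ge 1, \]
and
\[ k(t) = \frac{\partial f}{\partial x}(x,t), \qquad c < t < d. \]
Then $k_n(t)$, n ≥ 1, and k(t) are continuous on (c,d), and $k_n(t) \to k(t)$ for
each t by the definition of the partial derivative. By the mean value
theorem,

\[ k_n(t) = \frac{\partial f}{\partial x}(x_n',t), \qquad c < t < d,\ n \ge 1, \]

for some $x_n'$⁷ between $x_n$ and x. By the domination hypothesis, we see that
$|k_n(t)| \le g(t)$. Thus, we can apply the dominated convergence theorem, which
yields

\[ \frac{F(x_n) - F(x)}{x_n - x} = \int_c^d k_n(t)\,dt \to \int_c^d k(t)\,dt = \int_c^d \frac{\partial f}{\partial x}(x,t)\,dt. \]

This establishes (5.4.3).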
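Now we compute $M(1, 1/\sqrt{2})$.


Theorem 5.4.2.

\[ M\left(1, \frac{1}{\sqrt{2}}\right) = \frac{\Gamma(3/4)}{\Gamma(1/4)}\sqrt{2\pi}. \qquad (5.4.5) \]

To this end, bring in the beta function⁸

\[ B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt, \qquad x > 0,\ y > 0. \qquad (5.4.6) \]

The next result shows that 1/B(x,y) extends the binomial coefficient $\binom{x+y}{x}$
to nonnatural x and y.

Theorem 5.4.3. For all a > 0 and b > 0,

\[ B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}. \qquad (5.4.7) \]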
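We derive this following the method used to obtain (5.4.1). First, write
\[ \Gamma(a)\Gamma(b) = \int_0^\infty \Gamma(b)e^{-t}t^{a-1}\,dt \qquad (5.4.8) \]

and

\[ \Gamma(b)e^{-t}t^{a-1} = e^{-t}t^{a-1}\int_0^\infty e^{-s}s^{b-1}\,ds
   = \int_0^\infty e^{-s-t}s^{b-1}t^{a-1}\,ds
   = \int_t^\infty e^{-r}(r-t)^{b-1}t^{a-1}\,dr, \qquad t > 0. \]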
t 


7 x’ also depends on t. 


8 B(z,y) is finite by (5.4.7). 


Here we substituted r = s + t, dr = ds. Now set

\[ h(t,r) = e^{-r}(r-t)^{b-1}t^{a-1}, \]

\[ f(x,t) = \int_{t/x}^\infty h(t,r)\,dr, \qquad t > 0,\ 0 < x < 1, \]

and
\[ F(x) = \int_0^\infty f(x,t)\,dt, \qquad 0 < x < 1. \]

By continuity at the endpoints (the integrand is nonnegative), $f(1-,t) =
\int_t^\infty h(t,r)\,dr = e^{-t}t^{a-1}\Gamma(b)$. Then (5.4.8) says

\[ \int_0^\infty f(1-,t)\,dt = \Gamma(a)\Gamma(b). \]

Since f(x,t) ≤ f(1−,t) for 0 < x < 1 and f(1−,t) is integrable, the
dominated convergence theorem applies, and we conclude that F(1−) = Γ(a)Γ(b).
Moreover, F(0+) = 0. To see this, note, by continuity at the endpoints
(the integrand is integrable), that we have f(0+,t) = 0 for all t > 0. By the
dominated convergence theorem, again, it follows that F(0+) = 0.
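Now by the fundamental theorem and the chain rule,

\[ \frac{\partial f}{\partial x}(x,t) = -e^{-t/x}t^{a-1}\left(\frac{t}{x} - t\right)^{b-1}\left(-\frac{t}{x^2}\right)
   = \left(\frac{t}{x}\right)^{a+b-1} e^{-t/x}\,x^{a-1}(1-x)^{b-1}\cdot\frac{1}{x}; \qquad (5.4.9) \]
hence (0 < x < 1)
\[ \left|\frac{\partial f}{\partial x}(x,t)\right| \le \frac{e^{-t}t^{a+b-1}}{x(1-x)}\left(\frac{1}{x} - 1\right)^b. \qquad (5.4.10) \]

Fix 0 < ε < 1 and suppose ε < x < 1 − ε. Then the function x(1 − x)
is minimized on ε ≤ x ≤ 1 − ε at the endpoints, so its minimum value is
ε(1 − ε). The maximum value of the factor ((1/x) − 1)^b is attained at x = ε
and equals ((1/ε) − 1)^b. Hence if we set $C_\epsilon = ((1/\epsilon) - 1)^b/\epsilon(1-\epsilon)$, we obtain
\[ \left|\frac{\partial f}{\partial x}(x,t)\right| \le C_\epsilon\,e^{-t}t^{a+b-1}, \qquad \epsilon < x < 1-\epsilon,\ t > 0. \]

Thus, the domination hypothesis is verified on (ε, 1 − ε) × (0,∞) with⁹
$g(t) = f(1-,t) + C_\epsilon e^{-t}t^{a+b-1}$. Differentiating under the integral sign and
substituting t/x = u, dt/x = du,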


9 $\int_0^\infty g(t)\,dt = \Gamma(a)\Gamma(b) + C_\epsilon\Gamma(a+b)$.
\[ F'(x) = \int_0^\infty \left(\frac{t}{x}\right)^{a+b-1} e^{-t/x}\,x^{a-1}(1-x)^{b-1}\,\frac{dt}{x}
   = \int_0^\infty u^{a+b-1}e^{-u}\,x^{a-1}(1-x)^{b-1}\,du = x^{a-1}(1-x)^{b-1}\,\Gamma(a+b), \]
valid on (ε, 1 − ε). Since ε > 0 is arbitrary, we obtain
\[ F'(x) = x^{a-1}(1-x)^{b-1}\,\Gamma(a+b), \qquad 0 < x < 1. \]
Integrating, we arrive at
\[ \Gamma(a)\Gamma(b) = F(1-) - F(0+) = \int_0^1 F'(x)\,dx
   = \Gamma(a+b)\int_0^1 x^{a-1}(1-x)^{b-1}\,dx = \Gamma(a+b)B(a,b), \]

which is (5.4.7).
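To derive (5.4.5), we use (5.4.7) and a sequence of substitutions. From
§5.3,
\[ \frac{\pi/2}{M(1,1/\sqrt{2})} = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - \frac{1}{2}\sin^2\theta}}. \]

Substituting sin θ = t, we obtain

\[ \frac{\pi/2}{M(1,1/\sqrt{2})} = \sqrt{2}\int_0^1 \frac{dt}{\sqrt{(1-t^2)(2-t^2)}}. \]
Now substitute x² = t²/(2 − t²) to obtain

\[ \frac{\pi/2}{M(1,1/\sqrt{2})} = \sqrt{2}\int_0^1 \frac{dx}{\sqrt{1-x^4}}. \]

Now substitute u = x⁴ to get

\[ \frac{\pi/2}{M(1,1/\sqrt{2})} = \frac{\sqrt{2}}{4}\int_0^1 u^{1/4-1}(1-u)^{1/2-1}\,du = \frac{\sqrt{2}}{4}\,B\left(\frac{1}{4},\frac{1}{2}\right). \]

Since
\[ B\left(\frac{1}{4},\frac{1}{2}\right) = \frac{\Gamma(1/4)\Gamma(1/2)}{\Gamma(3/4)} \]
and (Exercise 5.4.1) $\Gamma(1/2) = \sqrt{\pi}$, we obtain (5.4.5).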
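
Formula (5.4.5) is easy to test numerically. The sketch below is an illustration, not part of the text: it computes M(1, 1/√2) by iterating the arithmetic and geometric means a fixed number of times (an assumed 20 iterations, far more than needed) and compares the result with the gamma-function expression, using the standard library's `math.gamma`.

```python
import math

def agm(a, b, iterations=20):
    """Arithmetic-geometric mean M(a, b) by repeated averaging."""
    for _ in range(iterations):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return (a + b) / 2

left_side = agm(1.0, 1 / math.sqrt(2))
right_side = math.gamma(0.75) / math.gamma(0.25) * math.sqrt(2 * math.pi)
print(left_side, right_side)
```

The two printed values should agree to nearly full double precision (both are approximately 0.8472).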
We end the section with an important special case of the theorem. Suppose
that (c,d) = (0,∞) and f(x,t) is piecewise constant in t, i.e., suppose that
$f(x,t) = f_n(x)$, a < x < b, n − 1 < t ≤ n, n ≥ 1. Then the integral in (5.4.2)
reduces to an infinite series. Hence, the theorem takes the following form.

Theorem 5.4.4 (Differentiation Under the Summation Sign). Let $f_n$ :
(a,b) → R, n ≥ 1, be a sequence of differentiable functions. Suppose that
there is a convergent positive series $\sum g_n$ of numbers, such that

\[ |f_n(x)| + |f_n'(x)| \le g_n, \qquad a < x < b,\ n \ge 1. \]
If
\[ F(x) = \sum_{n=1}^\infty f_n(x), \qquad a < x < b, \]

then F is differentiable on (a,b), F′ : (a,b) → R is continuous, and
\[ F'(x) = \sum_{n=1}^\infty f_n'(x), \qquad a < x < b. \]

To derive this, one, of course, applies the dominated convergence theorem
for series instead of the theorem for integrals.
Let f : (a,b) × (c,d) → R be a function of two variables (x,y), and suppose
that ∂f/∂x exists. If ∂f/∂x is differentiable with respect to x, we denote its
derivative by
\[ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}. \]
This is needed in Exercise 5.7.6.


Exercises 


5.4.1. Use the substitution $x = \sqrt{2t}$ in (5.4.1) to obtain $\Gamma(1/2) = \sqrt{\pi}$.
Conclude that $(1/2)! = \sqrt{\pi}/2$.


5.4.2. Show that the Laplace transform

\[ L(s) = \int_{-\infty}^\infty e^{sx}e^{-x^2/2}\,dx, \qquad -\infty < s < \infty, \]
is given by $L(s) = \sqrt{2\pi}\,e^{s^2/2}$. (Complete the square in the exponent, and use
translation invariance.)


5.4.3. Compute $L^{(2n)}(0)$ with L as in the previous Exercise, to obtain

\[ \int_{-\infty}^\infty e^{-x^2/2}x^{2n}\,dx = \sqrt{2\pi}\cdot\frac{(2n)!}{2^n n!}, \qquad n \ge 0. \]

(Writing out the power series of L yields $L^{(2n)}(0)$.)
5.4.4. Show that the Fourier transform

\[ F(s) = \int_{-\infty}^\infty e^{-x^2/2}\cos(sx)\,dx, \qquad -\infty < s < \infty, \]

is finite and differentiable on (−∞, ∞). Differentiate under the integral sign,
and integrate by parts to show that F′(s)/F(s) = −s for all s. Integrate this
equation over (0,s), and use $F(0) = \sqrt{2\pi}$ to obtain

\[ F(s) = \sqrt{2\pi}\,e^{-s^2/2}. \]


5.4.5. Derive the Hecke integral
\[ H(a) = \int_0^\infty e^{-x-a/x}\,\frac{dx}{\sqrt{x}} = \sqrt{\pi}\,e^{-2\sqrt{a}}, \qquad a > 0, \qquad (5.4.11) \]

by differentiating under the integral sign and substituting x = a/t to obtain
$H'(a)/H(a) = -1/\sqrt{a}$. Integrate this equation over (0,a), and use $H(0) =
\Gamma(1/2) = \sqrt{\pi}$ to obtain (5.4.11).


5.4.6. Show that
\[ \int_{-\infty}^\infty e^{-x^2/2q}\,dx = \sqrt{2\pi q}, \qquad q > 0. \]


5.4.7. Let $\psi(t) = \sum_{n=1}^\infty e^{-n^2\pi t}$, t > 0. Use the integral test (Exercise 4.3.8)
to show that

\[ \lim_{t\to 0+} \sqrt{t}\,\psi(t) = \frac{1}{2}. \]


5.4.8. Show that $\zeta(x) = \sum_{n=1}^\infty 1/n^x$, x > 1, is smooth (differentiation under
the summation sign).


5.4.9. Show that $\psi(t) = \sum_{n=1}^\infty e^{-n^2\pi t}$, t > 0, is smooth.


5.4.10. Show that the Bessel function $J_\nu$ (Exercise 5.2.12) is smooth. If ν
is an integer, show that $J_\nu$ satisfies Bessel's equation

\[ x^2J_\nu''(x) + xJ_\nu'(x) + (x^2 - \nu^2)J_\nu(x) = 0, \qquad -\infty < x < \infty. \]
(Differentiation under the integral sign and integration by parts.)


5.4.11. Suppose that f : R → R is nonnegative, superlinear, and continuous,
and let

\[ F(s) = \int_{-\infty}^\infty e^{sx}e^{-f(x)}\,dx, \qquad -\infty < s < \infty, \]

denote the Laplace transform of $e^{-f}$ (Exercise 4.3.11). Show that F is
smooth, and compute (log F)″. Use the Cauchy–Schwarz inequality
(Exercise 4.4.21) to conclude that log F is convex.
5.4.12. Let $F(b) = \int_0^b \sin x/x\,dx$, b > 0. From Exercise 4.3.14, we know F
is bounded and F(∞) exists. Integrate by parts to show that

\[ \int_0^b e^{-sx}\,\frac{\sin x}{x}\,dx = e^{-sb}F(b) + s\int_0^b e^{-sx}F(x)\,dx, \qquad s > 0. \]

Let b → ∞, change variables on the right, and let s → 0+ to get

\[ \lim_{s\to 0+}\int_0^\infty e^{-sx}\,\frac{\sin x}{x}\,dx = \lim_{b\to\infty}\int_0^b \frac{\sin x}{x}\,dx. \]

Conclude that F(∞) = π/2.


5.5 Stirling’s Approximation 


The main purpose of this section is to derive Stirling's approximation to
n!. If $(a_n)$ and $(b_n)$ are positive sequences, we say that $(a_n)$ and $(b_n)$ are
asymptotically equal, and we write $a_n \sim b_n$ as n ↗ ∞, if $a_n/b_n \to 1$ as
n ↗ ∞. Note that $a_n \sim b_n$ as n ↗ ∞ iff $\log a_n - \log b_n \to 0$ as n ↗ ∞.


Theorem 5.5.1. If x is any real, then
\[ \Gamma(x+n) \sim n^{x+n-1/2}\,e^{-n}\sqrt{2\pi}, \qquad n \nearrow \infty. \qquad (5.5.1) \]
In particular, if x = 1, we have Stirling's approximation
\[ n! \sim n^{n+1/2}\,e^{-n}\sqrt{2\pi}, \qquad n \nearrow \infty. \]
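
To get a feel for the quality of the approximation, here is a short numerical sketch (an illustration, not part of the text; the function names are hypothetical). It evaluates Stirling's approximation in floating point and compares it with the exact integer `math.factorial(n)`. Floating point overflows for n much beyond 100 in this direct form, so the sketch stays at or below n = 100.

```python
import math

def stirling(n):
    # n^(n + 1/2) * e^(-n) * sqrt(2*pi), evaluated in floating point
    return math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)

def relative_error(n):
    exact = math.factorial(n)  # exact integer n!
    return abs(exact - stirling(n)) / exact

for n in (5, 20, 100):
    print(n, relative_error(n))
```

The relative error shrinks roughly like 1/(12n) as n grows, consistent with the ratio of the two sides tending to 1.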


A consequence of Stirling's approximation is Raabe's formula (Exercise
5.5.9), which yields the Stirling identity

\[ \exp\left(\int_s^{s+1} \log\Gamma(x)\,dx\right) = s^s e^{-s}\sqrt{2\pi}, \qquad s > 0. \qquad (5.5.2) \]

Here the left side is the geometric mean of Γ over (s, s+1). The subtlety
here is the exact constant $\sqrt{2\pi}$. Apart from this, this identity is an immediate
consequence of the definition of Γ (Exercise 5.1.15).

Note that Γ(x + n) is defined as soon as n > −x. By taking the log of
both sides, (5.5.1) is equivalent to

\[ \lim_{n\nearrow\infty}\left\{\log\Gamma(x+n) - \left[\left(x+n-\frac{1}{2}\right)\log n - n\right]\right\} = \frac{1}{2}\log(2\pi). \]
To derive (5.5.1), recall that

\[ \Gamma(x+n) = \int_0^\infty e^{-t}t^{x+n-1}\,dt, \qquad x > 0. \qquad (5.5.3) \]

Since this integral is the area of the subgraph of $e^{-t}t^{x+n-1}$ and all we want
is an approximation, not an exact evaluation, of this integral, let us check
where the integrand is maximized, as this will tell us where the greatest
contribution to the area is located. A simple computation shows that the
integrand is maximized at t = x + n − 1, which goes to infinity with n. To
get a handle on this region of maximum area, perform the change of variable
t = ns. This leads to

\[ \Gamma(x+n) = n^{x+n}\int_0^\infty e^{-ns}s^{x+n-1}\,ds = n^{x+n}\int_0^\infty e^{nf(s)}s^{x-1}\,ds, \qquad (5.5.4) \]

where
\[ f(s) = \log s - s, \qquad s > 0. \]

Now the varying part $e^{nf(s)}$ of the integrand is maximized at the maximum
of f, which occurs at s = 1, since f(0+) = −∞, f(∞) = −∞. Since the
maximum value of f at s = 1 is −1, the maximum value of the integrand is
roughly $e^{nf(1)} = e^{-n}$. By analogy with sums (Exercise 5.5.1), we expect the
limiting behavior of the integral in (5.5.4) to involve the maximum value of
the integrand. Let us pause in the derivation of Stirling's formula, and turn
to the study of the limiting behavior of such integrals, in general.


Theorem 5.5.2. Suppose that f : (a,b) → R is continuous and bounded
above on a bounded interval (a,b). Then

\[ \lim_{n\nearrow\infty}\frac{1}{n}\log\left[\int_a^b e^{nf(x)}\,dx\right] = \sup\{f(x) : a < x < b\}. \qquad (5.5.5) \]

To see this (Figure 5.6), let $I_n$ denote the integral, and let M = sup{f(x) :
a < x < b}. Then M is finite since f is bounded above. Given ε > 0, choose
c ∈ (a,b) with f(c) > M − ε, and, by continuity, choose δ > 0, such that
f(x) > f(c) − ε on (c − δ, c + δ). Then f(x) > M − 2ε on (c − δ, c + δ), and

\[ (b-a)e^{nM} \ge I_n \ge \int_{c-\delta}^{c+\delta} e^{nf(x)}\,dx \ge \int_{c-\delta}^{c+\delta} e^{n(M-2\epsilon)}\,dx = 2\delta\,e^{n(M-2\epsilon)}. \]

Now take the log of this last inequality, and divide by n to obtain

\[ \frac{1}{n}\log(b-a) + M \ge \frac{1}{n}\log(I_n) \ge \frac{1}{n}\log(2\delta) + M - 2\epsilon. \]

Sending n ↗ ∞, the upper and lower limits of the sequence $((1/n)\log(I_n))$
lie between M and M − 2ε. Since ε > 0 is arbitrary, $(1/n)\log(I_n) \to M$.
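
The convergence in (5.5.5) is slow but visible numerically. The following sketch is illustrative only: the choice f(x) = x(1 − x) on (0,1), whose supremum is 1/4, and the helper names are assumptions. It evaluates (1/n) log I_n by Simpson's rule for a few values of n.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def f(x):
    return x * (1 - x)  # assumed example; sup over (0, 1) is 1/4 at x = 1/2

def scaled_log_integral(n, panels=4000):
    # (1/n) * log of the integral of e^{n f(x)} over (0, 1)
    integral = simpson(lambda x: math.exp(n * f(x)), 0.0, 1.0, panels)
    return math.log(integral) / n

for n in (10, 100, 1000):
    print(n, scaled_log_integral(n))
```

The printed values increase toward 1/4; the remaining gap is of order (log n)/n, which is why the sharper statement of Laplace's theorem is needed for Stirling's approximation.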
Fig. 5.6 The global max is what counts


Although a good start, this result is not quite enough to obtain Stirling’s 
approximation. The exact form of the limiting behavior, due to Laplace, is 
given by the following. 


Theorem 5.5.3 (Laplace's Theorem). Let f : (a,b) → R be differentiable
and assume f is concave. Suppose that f has a global maximum at c ∈ (a,b)
with f twice differentiable at c and f″(c) < 0. Suppose that g : (a,b) → R is
continuous with polynomial growth and g(c) > 0. Then

\[ \int_a^b e^{nf(x)}g(x)\,dx \sim e^{nf(c)}g(c)\sqrt{\frac{2\pi}{-nf''(c)}}, \qquad n \nearrow \infty. \qquad (5.5.6) \]

This result is motivated by the fact that when g(x) = 1, a = −∞, b = ∞,
and f(x) is a quadratic polynomial, (5.5.6) is an equality.

By polynomial growth, we mean that $|g(x)| \le A + B|x|^p$, a < x < b,
for some constants A, B, p. Before we derive this theorem, let us apply
it to obtain the asymptotic behavior of (5.5.4) to complete the derivation
of (5.5.1).
In the case of (5.5.4), f′(s) = 1/s − 1, and f″(s) = −1/s², so f is strictly
concave, has a global maximum f(1) = −1 at c = 1, and f″(1) = −1 < 0.
Since $g(s) = s^{x-1}$ has polynomial growth (in s), the integral in (5.5.4) is
asymptotic to $e^{-n}\sqrt{2\pi/n}$, which yields (5.5.1).
Now we derive Laplace's theorem. We write $I_n = I_n^- + I_n^0 + I_n^+$, where

\[ I_n^- = \int_a^{c-\delta} e^{nf(x)}g(x)\,dx, \]
\[ I_n^0 = \int_{c-\delta}^{c+\delta} e^{nf(x)}g(x)\,dx, \]
\[ I_n^+ = \int_{c+\delta}^b e^{nf(x)}g(x)\,dx. \]
Since c is a maximum, f′(c) = 0. Since f″(c) exists, by Taylor's theorem
(§3.5), there is a continuous function h : (a,b) → R satisfying h(c) = f″(c),
and
\[ f(x) = f(c) + f'(c)(x-c) + \frac{1}{2}h(x)(x-c)^2 = f(c) + \frac{1}{2}h(x)(x-c)^2. \qquad (5.5.7) \]


If we let $\mu_c(\delta)$ denote the modulus of continuity of h at c and let $\epsilon = \mu_c(\delta)$,
then ε → 0 as δ → 0. Thus, we can choose δ > 0, such that $h(x) \le f''(c) +
\mu_c(\delta) = f''(c) + \epsilon < 0$ and g(x) > 0 on (c − δ, c + δ). Now substituting
$x = c + t/\sqrt{n}$ in $I_n^0$, $dx = dt/\sqrt{n}$, and inserting (5.5.7),

\[ I_n^0 = \int_{c-\delta}^{c+\delta} e^{nf(c) + nh(x)(x-c)^2/2}g(x)\,dx
   = \frac{e^{nf(c)}}{\sqrt{n}}\int_{-\delta\sqrt{n}}^{\delta\sqrt{n}} e^{h(c+t/\sqrt{n})t^2/2}\,g(c+t/\sqrt{n})\,dt. \]

But g(x) is bounded on (c − δ, c + δ). Hence,
\[ \left|e^{h(c+t/\sqrt{n})t^2/2}\,g(c+t/\sqrt{n})\right| \le Ce^{(f''(c)+\epsilon)t^2/2}, \qquad |t| < \delta\sqrt{n}, \]

which is integrable¹⁰ over (−∞, ∞). Thus, the dominated convergence
theorem applies, and¹¹

\[ \sqrt{n}\,e^{-nf(c)}I_n^0 = \int_{-\delta\sqrt{n}}^{\delta\sqrt{n}} e^{h(c+t/\sqrt{n})t^2/2}\,g(c+t/\sqrt{n})\,dt
   \to \int_{-\infty}^\infty e^{f''(c)t^2/2}g(c)\,dt = g(c)\sqrt{\frac{2\pi}{-f''(c)}}, \]

by Exercise 5.4.6.
We conclude that

\[ I_n^0 \sim e^{nf(c)}g(c)\sqrt{\frac{2\pi}{-nf''(c)}}. \qquad (5.5.8) \]

To finish the derivation, it is enough to show that $I_n \sim I_n^0$ as n ↗ ∞.
To derive $I_n \sim I_n^0$, it is enough to obtain $I_n^+/I_n^0 \to 0$ and $I_n^-/I_n^0 \to 0$, since

\[ \frac{I_n}{I_n^0} = \frac{I_n^-}{I_n^0} + 1 + \frac{I_n^+}{I_n^0}. \]

To obtain $I_n^+/I_n^0 \to 0$, we use convexity. Since f is concave, −f is convex.
Hence, the graph of −f on (c + δ, b) lies above its tangent line at c + δ
(Exercise 3.3.7). Thus,

\[ f(x) \le f(c+\delta) + f'(c+\delta)(x - c - \delta), \qquad a < x < b. \]


10 The integral is $C\sqrt{2\pi/(-f''(c) - \epsilon)}$.
11 By Exercise 5.2.14.
Since f is strictly concave at c, f′(c + δ) < 0 and f(c + δ) < f(c). Inserting
this in the definition for $I_n^+$ and substituting x = t + c + δ,

\[ I_n^+ \le \int_{c+\delta}^b e^{nf(c+\delta) + nf'(c+\delta)(x-c-\delta)}\,|g(x)|\,dx \qquad (5.5.9) \]
\[ \le e^{nf(c+\delta)}\int_0^{b-c-\delta} e^{nf'(c+\delta)t}\,(A + B|t+c+\delta|^p)\,dt \qquad (5.5.10) \]
\[ \le e^{nf(c+\delta)}\int_0^\infty e^{f'(c+\delta)t}\,(A + B|t+c+\delta|^p)\,dt \qquad (5.5.11) \]
\[ = Ce^{nf(c+\delta)}, \qquad (5.5.12) \]

where C denotes the (finite) integral in (5.5.11). Now divide this last expression
by the expression in (5.5.8), obtaining

\[ \frac{I_n^+}{I_n^0} \le \frac{Ce^{nf(c+\delta)}}{I_n^0} \sim \frac{Ce^{nf(c+\delta)}}{e^{nf(c)}g(c)\sqrt{\dfrac{2\pi}{-nf''(c)}}}
   = \text{constant}\cdot\sqrt{n}\cdot e^{-n(f(c) - f(c+\delta))}, \]

which goes to zero as n ↗ ∞ since the exponent is negative. Since $I_n^-/I_n^0$ is
similar, this completes the derivation.

Since Stirling's approximation provides a manageable expression for n!, it
is natural to use it to derive the asymptotics of the binomial coefficient

\[ \binom{n}{k} = \frac{n!}{k!(n-k)!}, \qquad 0 \le k \le n. \]

Actually, of more interest is the binomial coefficient divided by $2^n$, since this
is the probability of obtaining k heads in n tosses of a fair coin.

To this end, suppose 0 < t < 1 and let $(k_n)$ be a sequence of naturals
such that $k_n/n \to t$ as n → ∞. Writing $t_n = k_n/n$ and applying Stirling to
n!, $k_n! = (t_nn)!$, and $(n - k_n)! = ((1 - t_n)n)!$ and simplifying, we obtain the
following:

Theorem 5.5.4. Fix 0 < t < 1. If $(k_n)$ is a sequence of naturals such that
the ratio $t_n = k_n/n \to t$, as n ↗ ∞, then the probabilities $\binom{n}{k}2^{-n}$ of tossing
$k = k_n$ heads in n tosses satisfy

\[ \binom{n}{k}2^{-n} \sim \frac{1}{\sqrt{2\pi n}}\cdot\frac{1}{\sqrt{t(1-t)}}\cdot e^{-nH(t_n)}, \qquad n \nearrow \infty, \]

where

\[ H(t) = t\log(2t) + (1-t)\log[2(1-t)], \qquad 0 < t < 1. \]
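
A quick numerical comparison illustrates the theorem. In this sketch (an illustration, not part of the text; the function names are hypothetical), the exact probability is computed with `math.comb` and the entropy is evaluated at the ratio t = k/n itself, an assumption made so that the comparison is meaningful for a single pair (n, k).

```python
import math

def entropy(t):
    # H(t) = t log(2t) + (1 - t) log(2(1 - t)), 0 < t < 1
    return t * math.log(2 * t) + (1 - t) * math.log(2 * (1 - t))

def exact_probability(n, k):
    # probability of k heads in n tosses of a fair coin
    return math.comb(n, k) / 2 ** n

def approximation(n, k):
    t = k / n  # entropy and prefactor evaluated at the ratio k/n
    prefactor = 1 / (math.sqrt(2 * math.pi * n) * math.sqrt(t * (1 - t)))
    return prefactor * math.exp(-n * entropy(t))

for n, k in ((10, 6), (100, 60), (1000, 600)):
    print(n, k, exact_probability(n, k) / approximation(n, k))
```

The printed ratios approach 1 as n grows with k/n held at 0.6.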
Because the binomial coefficients are so basic, the function H which
governs their asymptotic decay must be important. The function H, called the
entropy, controls the rate of decay of the binomial coefficients. Note that H
is convex (Figure 5.7) on (0,1) and has a global minimum of zero at t = 1/2
with H″(1/2) = 4.


Fig. 5.7 The entropy H(x)


We end the section with an application of (5.5.1) to the following formula 
for the gamma function. 


Theorem 5.5.5 (The Duplication Formula). For s > 0,

\[ 2^{2s}\,\frac{\Gamma(s)\Gamma(s+1/2)}{\Gamma(2s)} = 2\sqrt{\pi}. \]

To derive this, let f(s) denote the left side. Then using Γ(s + 1) = sΓ(s),
check that f is periodic of period 1, i.e., f(s + 1) = f(s). Hence, f(s + n) = f(s)
for all n ≥ 1. Now inserting the asymptotic (5.5.1) (three times) in the
expression for f(s + n) yields $f(s+n) \sim 2\sqrt{\pi}$, as n ↗ ∞. Hence, $f(s) = 2\sqrt{\pi}$,
which is the duplication formula.
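
The formula can be spot-checked with the standard library's `math.gamma`. This is only a numerical illustration at a few arbitrarily chosen values of s (an assumption of the sketch), not a substitute for the argument above.

```python
import math

def duplication_left_side(s):
    # 2^(2s) * Gamma(s) * Gamma(s + 1/2) / Gamma(2s)
    return 2 ** (2 * s) * math.gamma(s) * math.gamma(s + 0.5) / math.gamma(2 * s)

target = 2 * math.sqrt(math.pi)
for s in (0.3, 1.0, 2.7, 10.5):
    print(s, duplication_left_side(s), target)
```

Every row should show the same value, about 3.5449, illustrating that the left side does not depend on s.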


Exercises 


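5.5.1. Show that

\[ \lim_{n\nearrow\infty}(a^n + b^n + c^n)^{1/n} = \max(a,b,c), \qquad a,b,c > 0, \]
and

\[ \lim_{n\nearrow\infty}\frac{1}{n}\log\left(e^{na} + e^{nb} + e^{nc}\right) = \max(a,b,c), \qquad a,b,c \in \mathbf{R}. \]

Moreover, if $\log(a_n)/n \to A$, $\log(b_n)/n \to B$, and $\log(c_n)/n \to C$, then

\[ \lim_{n\nearrow\infty}\frac{1}{n}\log(a_n + b_n + c_n) = \max(A,B,C). \]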


5.5.2. Write computer code to obtain 100! and its Stirling approximation s. 
Compute the relative error |100! — s|/100!. 
5.5.3. Show that $\binom{2n}{n}2^{-2n} \sim 1/\sqrt{\pi n}$ as n ↗ ∞.


5.5.4. Apply Stirling to n!, k!, and (n − k)! to derive the asymptotic for
$\binom{n}{k}2^{-n}$ given in Theorem 5.5.4.


5.5.5. Let 0 < p < 1. Graph
\[ H(t,p) = t\log(t/p) + (1-t)\log[(1-t)/(1-p)], \qquad 0 < t < 1. \]


5.5.6. Suppose that a flawed coin is such that the probability of obtaining
heads in a single toss is p, where 0 < p < 1. Let 0 < t < 1 and let $(k_n)$
be a sequence of naturals satisfying $t_n = k_n/n \to t$ as n → ∞. Show that the
probabilities $\binom{n}{k}p^k(1-p)^{n-k}$ of obtaining $k = k_n$ heads in n tosses satisfy

\[ \binom{n}{k}p^k(1-p)^{n-k} \sim \frac{1}{\sqrt{2\pi n}}\cdot\frac{1}{\sqrt{t(1-t)}}\cdot e^{-nH(t_n,p)}, \qquad n \nearrow \infty. \]


5.5.7. For0<q<land0<a<b<o, let f(g) = i g® dx. Compute 
lim log f(a") 
im — . 
nZon g q 
5.5.8. Show that

\[ 3^{3s}\,\frac{\Gamma(s)\Gamma(s+1/3)\Gamma(s+2/3)}{\Gamma(3s)} = 2\pi\sqrt{3}, \qquad s > 0. \]

Generalize to

\[ n^{ns}\,\frac{\Gamma(s)\Gamma(s+1/n)\cdots\Gamma(s+(n-1)/n)}{\Gamma(ns)} = \sqrt{n}\cdot(2\pi)^{(n-1)/2}, \qquad s > 0. \]


5.5.9. Take the limit of the logarithm in the previous exercise using Riemann
sums to get Raabe's formula

\[ \int_0^1 \log\Gamma(t+s)\,dt = s\log s - s + \frac{1}{2}\log(2\pi), \qquad s > 0, \]

which implies the Stirling identity (5.5.2).


5.5.10. Let f :R — R be superlinear and continuous. Consider the Laplace 
transforms 


L(y) =} ere) dar, n>1. 


Show that 
where g is the Legendre transform (3.3.10) of f. (Break L(ny) into three
pieces, as in Exercise 4.3.11, and use Exercise 5.5.1.) 


5.5.11. Differentiate the log of the duplication formula to obtain

\[ \frac{\Gamma'(1)}{\Gamma(1)} - \frac{\Gamma'(1/2)}{\Gamma(1/2)} = 2\log 2. \]


5.5.12. Use the duplication formula to get $\Gamma(1/4)\Gamma(3/4) = \pi\sqrt{2}$. Hence,


5.6 Infinite Products 


Given a sequence $(a_n)$, let $p_n = (1+a_1)(1+a_2)\cdots(1+a_n)$ denote the nth
partial product, n ≥ 1. We say that the infinite product $\prod_{n=1}^\infty(1+a_n)$
converges if there is a finite L, such that $p_n \to L$. In this case, we write

\[ L = \prod_{n=1}^\infty (1+a_n). \]
For example, by induction, check that, for n ≥ 1,
\[ (1+x)(1+x^2)(1+x^4)\cdots(1+x^{2^{n-1}}) = 1 + x + x^2 + x^3 + \cdots + x^{2^n-1}. \]

If |x| < 1, the sum converges. Hence, the product converges to¹²

\[ \prod_{n=0}^\infty (1+x^{2^n}) = 1 + x + x^2 + x^3 + \cdots = \frac{1}{1-x}, \qquad |x| < 1. \]
If $\prod(1+a_n)$ converges and L ≠ 0, then $1 + a_n = p_n/p_{n-1} \to L/L = 1$.
Hence, a necessary condition for convergence, when L ≠ 0, is $a_n \to 0$.
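
The finite identity behind this example can be verified exactly with rational arithmetic. The sketch below is an illustration with hypothetical helper names and an arbitrarily chosen x = 1/3; it uses `fractions.Fraction` so that no rounding is involved.

```python
from fractions import Fraction

def partial_product(x, n):
    """(1 + x)(1 + x^2)(1 + x^4)...(1 + x^(2^(n-1))), n factors."""
    product = Fraction(1)
    power = x
    for _ in range(n):
        product *= 1 + power
        power *= power  # x^(2^k) -> x^(2^(k+1))
    return product

def geometric_sum(x, n):
    """1 + x + x^2 + ... + x^(2^n - 1)."""
    return sum(x ** j for j in range(2 ** n))

x = Fraction(1, 3)
print(partial_product(x, 5) == geometric_sum(x, 5))
print(float(partial_product(x, 6)), float(1 / (1 - x)))
```

The first line prints True, and the second shows the partial product already indistinguishable from 1/(1 − x) in double precision.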


Theorem 5.6.1. For x ≠ 0,

\[ \frac{\sinh(\pi x)}{\pi x} = \prod_{n=1}^\infty \left(1 + \frac{x^2}{n^2}\right). \qquad (5.6.1) \]


12 This identity is simply a reflection of the fact that every natural has a unique 
binary expansion (§1.6). 




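This result provides an "infinite degree" polynomial factorization of

\[ \frac{\sinh(\pi x)}{\pi x} = \frac{e^{\pi x} - e^{-\pi x}}{2\pi x}. \]

Since for large N (Exercise 3.2.3) $e^{\pi x}$ is approximated by the polynomial
$(1 + \pi x/2N)^{2N}$, to derive (5.6.1), it makes sense to first factor the polynomial

\[ \left(1 + \frac{\pi x}{2N}\right)^{2N} - \left(1 - \frac{\pi x}{2N}\right)^{2N}, \]
which in turn suggests we use

\[ X^{2N} - 1 = (X^2 - 1)\cdot\prod_{n=1}^{N-1}\left(X^2 - 2X\cdot\cos(n\pi/N) + 1\right). \qquad (5.6.2) \]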


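This factorization, trivial for N = 2, is most easily derived using complex
numbers. However, by replacing X by X² in (5.6.2) and using the double-angle
formula, one obtains (5.6.2) with 2N replacing N (Exercise 3.6.14).
Hence, by induction, and without recourse to complex numbers, one
obtains (5.6.2) for N = 2, 4, 8, .... In fact, this is all we need to derive (5.6.1).
Insert X = a/b in (5.6.2) and multiply through by $b^{2N}$, obtaining

\[ a^{2N} - b^{2N} = (a^2 - b^2)\cdot\prod_{n=1}^{N-1}\left[a^2 - 2ab\cdot\cos\left(\frac{n\pi}{N}\right) + b^2\right]. \]
Now insert
\[ a = \left(1 + \frac{\pi x}{2N}\right), \qquad b = \left(1 - \frac{\pi x}{2N}\right). \]

Simplifying and dividing by 2πx, we obtain

\[ \frac{\left(1 + \frac{\pi x}{2N}\right)^{2N} - \left(1 - \frac{\pi x}{2N}\right)^{2N}}{2\pi x}
   = \frac{1}{N}\cdot\prod_{n=1}^{N-1}\left[2\left(1 - \cos\left(\frac{n\pi}{N}\right)\right) + \frac{\pi^2x^2}{2N^2}\left(1 + \cos\left(\frac{n\pi}{N}\right)\right)\right] \]
\[ = \frac{1}{N}\cdot\prod_{n=1}^{N-1}\left[4\sin^2\left(\frac{n\pi}{2N}\right) + \frac{\pi^2x^2}{N^2}\cos^2\left(\frac{n\pi}{2N}\right)\right], \qquad (5.6.3) \]

where we used the double-angle formula, again. Taking the limit of both sides
as x → 0 using l'Hôpital's rule (§3.2), we obtain

\[ 1 = \frac{1}{N}\cdot\prod_{n=1}^{N-1}\left[4\sin^2\left(\frac{n\pi}{2N}\right)\right]. \qquad (5.6.4) \]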


n=1 
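Now divide (5.6.3) by (5.6.4), factor by factor, obtaining

\[ \frac{\left(1 + \frac{\pi x}{2N}\right)^{2N} - \left(1 - \frac{\pi x}{2N}\right)^{2N}}{2\pi x}
   = \prod_{n=1}^{N-1}\left[1 + \frac{x^2}{n^2}\cdot f\left(\frac{n\pi}{2N}\right)\right], \]

where f(x) = x² cot² x. To obtain (5.6.1), we wish to take the limit N ↗ ∞.
But (tan x)′ = sec² x ≥ 1. So tan x ≥ x, so f(x) ≤ 1 on (0, π/2). Thus,

\[ \frac{\sinh(\pi x)}{\pi x} = \frac{e^{\pi x} - e^{-\pi x}}{2\pi x} \le \prod_{n=1}^\infty\left(1 + \frac{x^2}{n^2}\right), \]

which is half of (5.6.1). On the other hand, for M < N,
\[ \frac{\left(1 + \frac{\pi x}{2N}\right)^{2N} - \left(1 - \frac{\pi x}{2N}\right)^{2N}}{2\pi x}
   \ge \prod_{n=1}^{M}\left[1 + \frac{x^2}{n^2}\cdot f\left(\frac{n\pi}{2N}\right)\right]. \]

Since $\lim_{x\to 0} f(x) = 1$, sending N ↗ ∞ through powers of 2 in this last
equation, we obtain

\[ \frac{\sinh(\pi x)}{\pi x} \ge \prod_{n=1}^{M}\left(1 + \frac{x^2}{n^2}\right). \]

Now let M ↗ ∞, obtaining the other half of (5.6.1).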
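
Both the finite identity (5.6.4) and the limiting product (5.6.1) can be observed numerically. The sketch below is illustrative: the values N = 8, x = 1, and the cutoff of 100000 factors are assumptions, and the function names are hypothetical.

```python
import math

def finite_identity(N):
    # right side of (5.6.4): (1/N) * product of 4 sin^2(n*pi/(2N)), n = 1..N-1
    product = 1.0
    for n in range(1, N):
        product *= 4 * math.sin(n * math.pi / (2 * N)) ** 2
    return product / N

def partial_product(x, m):
    # product of (1 + x^2/n^2) for n = 1..m
    product = 1.0
    for n in range(1, m + 1):
        product *= 1 + (x / n) ** 2
    return product

print(finite_identity(8))  # should be very close to 1

x = 1.0
target = math.sinh(math.pi * x) / (math.pi * x)
approx = partial_product(x, 100000)
print(approx, target)
```

The partial products increase toward sinh(πx)/(πx) from below, with a relative gap of roughly x²/M after M factors, so convergence is slow.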
To give an example of the power of (5.6.1), take the log of both sides to
get
\[ \log\left(\frac{\sinh(\pi x)}{\pi x}\right) = \sum_{n=1}^\infty \log\left(1 + \frac{x^2}{n^2}\right), \qquad x \ne 0. \qquad (5.6.5) \]

Now differentiate under the summation sign to obtain

\[ \pi\coth(\pi x) - \frac{1}{x} = \sum_{n=1}^\infty \frac{2x}{n^2 + x^2}, \qquad x \ne 0. \qquad (5.6.6) \]

Here coth = cosh/sinh is the hyperbolic cotangent. To justify this, note that
$\log(1+t) = \int_0^t ds/(1+s) \le \int_0^t ds = t$. Hence, log(1 + t) ≤ t for t ≥ 0. Thus,
with $f_n(x) = \log(1 + x^2/n^2)$,

\[ |f_n(x)| + |f_n'(x)| \le (2b + b^2)/n^2 = g_n, \qquad |x| < b, \]
and $\sum g_n < \infty$. Thus, (5.6.6) is valid on 0 < |x| < b and hence on x ≠ 0.


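Now dividing (5.6.6) by 2x, letting x ↘ 0, and setting t = πx yield¹³

\[ \sum_{n=1}^\infty \frac{1}{n^2} = \lim_{x\searrow 0}\sum_{n=1}^\infty \frac{1}{n^2 + x^2}
   = \lim_{x\searrow 0}\frac{\pi x\coth(\pi x) - 1}{2x^2}
   = \pi^2\lim_{t\searrow 0}\frac{t\coth t - 1}{2t^2}. \]

But this last limit can be evaluated as follows. Since

\[ \frac{\sinh t}{t} = 1 + \frac{t^2}{3!} + \frac{t^4}{5!} + \cdots \]
and
\[ \cosh t = 1 + \frac{t^2}{2!} + \frac{t^4}{4!} + \cdots, \]
it follows that
\[ \frac{t\coth t - 1}{2t^2} = \frac{1}{2t^2}\left(\frac{\cosh t}{\sinh t/t} - 1\right)
   = \frac{1}{2t^2}\cdot\frac{\left(1 + \frac{t^2}{2!} + \frac{t^4}{4!} + \cdots\right) - \left(1 + \frac{t^2}{3!} + \frac{t^4}{5!} + \cdots\right)}{1 + \frac{t^2}{3!} + \frac{t^4}{5!} + \cdots} \]
\[ = \frac{\frac{1}{6} + \frac{t^2}{60} + \frac{t^4}{1680} + \cdots}{1 + \frac{t^2}{3!} + \frac{t^4}{5!} + \cdots}. \]

Now take the limit, as t ↘ 0, obtaining the following:


Theorem 5.6.2.

\[ \sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}. \]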


13 By the monotone convergence theorem for series. 
Recalling the zeta function
\[ \zeta(x) = \sum_{n=1}^\infty \frac{1}{n^x}, \qquad x > 1, \]

this result says that ζ(2) = π²/6, a result due to Euler. In fact, Euler
used (5.6.6) to compute ζ(2n) for all n ≥ 1. This computation involves certain
rational numbers first studied by Bernoulli.

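The Bernoulli function is defined by

\[ \tau(x) = \begin{cases} \dfrac{x}{1 - e^{-x}}, & x \ne 0, \\ 1, & x = 0. \end{cases} \]

Clearly, τ is a smooth function on x ≠ 0. The Bernoulli numbers $B_n$, n ≥ 0,
are defined by the Bernoulli series

\[ \tau(x) = B_0 + B_1x + \frac{B_2}{2!}x^2 + \frac{B_3}{3!}x^3 + \cdots. \qquad (5.6.7) \]
Since $1 - e^{-x} = x - x^2/2! + x^3/3! - x^4/4! + \cdots$,
\[ \frac{1 - e^{-x}}{x} = 1 - x/2! + x^2/3! - x^3/4! + \cdots. \]

Hence, to obtain the $B_n$'s, one computes the reciprocal of this last series,
which is obtained by setting the Cauchy product (§1.7)

\[ \left(1 - \frac{x}{2!} + \frac{x^2}{3!} - \frac{x^3}{4!} + \cdots\right)\left(B_0 + B_1x + \frac{B_2}{2!}x^2 + \frac{B_3}{3!}x^3 + \cdots\right) = 1. \]

Multiplying, this leads to $B_0 = 1$ and the recursion formula

\[ \frac{B_{n-1}}{(n-1)!\,1!} - \frac{B_{n-2}}{(n-2)!\,2!} + \cdots + (-1)^{n-1}\frac{B_0}{0!\,n!} = 0, \qquad n \ge 2. \qquad (5.6.8) \]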


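Computing, we see that each $B_n$ is a rational number with

\[ B_1 = \frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_4 = -\frac{1}{30}, \quad
   B_6 = \frac{1}{42}, \quad B_8 = -\frac{1}{30}, \quad B_{10} = \frac{5}{66}. \]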
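
The recursion (5.6.8) translates directly into a short program. The sketch below is an illustration (the function name is hypothetical): it solves (5.6.8) for the newest Bernoulli number at each step, using exact rational arithmetic from `fractions.Fraction`, and follows the sign convention of this section, in which B_1 = +1/2.

```python
from fractions import Fraction
from math import factorial

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] from the recursion (5.6.8)."""
    B = [Fraction(1)]  # B_0 = 1
    for m in range(1, count):
        # (5.6.8) with n = m + 1: sum over k of (-1)^(m-k) B_k / (k! (m+1-k)!) = 0
        acc = Fraction(0)
        for k in range(m):
            sign = -1 if (m - k) % 2 else 1
            acc += sign * B[k] / (factorial(k) * factorial(m + 1 - k))
        B.append(-factorial(m) * acc)  # solve for B_m
    return B

B = bernoulli_numbers(11)
for n in (1, 2, 4, 6, 8, 10):
    print(n, B[n])
```

The output reproduces the values listed above, and the odd-indexed entries beyond B_1 come out as zero.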


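It turns out (Exercise 5.6.2) that $|B_n| \le 2^n n!$. Hence, by the root test,
the Bernoulli series (5.6.7) converges, at least, for |x| < 1/2. In particular,
this shows that τ is smooth near zero. Hence, τ is smooth on R. Let 2πβ > 0
denote the radius¹⁴ of convergence of (5.6.7). Then (5.6.7) holds for |x| <
2πβ. Since

\[ \frac{x}{1 - e^{-x}} - \frac{x}{2} = \frac{x}{2}\cdot\frac{1 + e^{-x}}{1 - e^{-x}}
   = \frac{x}{2}\cdot\frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}} = \frac{x}{2}\coth\left(\frac{x}{2}\right), \]
subtracting $x/2 = B_1x$ from both sides of (5.6.7), we obtain
\[ \frac{x}{2}\coth\left(\frac{x}{2}\right) = 1 + \sum_{n=2}^\infty \frac{B_n}{n!}x^n, \qquad 0 < |x| < 2\pi\beta. \]

But (x/2) coth(x/2) is even. Hence, $B_3 = B_5 = B_7 = \cdots = 0$, and

\[ \frac{x}{2}\coth\left(\frac{x}{2}\right) - 1 = \sum_{n=1}^\infty \frac{B_{2n}}{(2n)!}x^{2n}, \qquad 0 < |x| < 2\pi\beta. \]

Now replacing x by $2\pi\sqrt{x}$ and dividing by x,

\[ \frac{\pi\sqrt{x}\coth(\pi\sqrt{x}) - 1}{x} = \sum_{n=1}^\infty \frac{B_{2n}}{(2n)!}(2\pi)^{2n}x^{n-1}, \qquad 0 < x < \beta^2. \]
Thus, from (5.6.6), we conclude that

\[ \frac{1}{2}\sum_{n=1}^\infty \frac{B_{2n}}{(2n)!}(2\pi)^{2n}x^{n-1} = \sum_{n=1}^\infty \frac{1}{n^2 + x}, \qquad 0 < x < \beta^2. \]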

Since the left side is a power series, we may differentiate it term by term
(§3.4). On the other hand, the right side may¹⁵ be differentiated under the
summation sign. Differentiating both sides r − 1 times,

\[ \frac{1}{2}\sum_{n=r}^\infty \frac{B_{2n}}{(2n)!}(2\pi)^{2n}\,\frac{(n-1)!}{(n-r)!}\,x^{n-r}
   = \sum_{n=1}^\infty \frac{(-1)^{r-1}(r-1)!}{(n^2 + x)^r}, \qquad 0 < x < \beta^2. \]
Sending x → 0+, the right side becomes $(-1)^{r-1}(r-1)!\,\zeta(2r)$, whereas the
left side reduces to the first coefficient (that corresponding to n = r). We have
derived the following.


Theorem 5.6.3. For all n ≥ 1,

\[ \zeta(2n) = \frac{(-1)^{n-1}}{2}\cdot\frac{B_{2n}}{(2n)!}\cdot(2\pi)^{2n}. \]
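
The theorem can be compared against direct partial sums of the zeta series. In this sketch (an illustration, with hypothetical names), the Bernoulli numbers B_2, B_4, B_6 are taken from the values listed earlier in this section, and the series is truncated at an assumed 100000 terms.

```python
import math
from fractions import Fraction

# B_2, B_4, B_6 as listed earlier in the section, keyed by n in zeta(2n)
EVEN_BERNOULLI = {1: Fraction(1, 6), 2: Fraction(-1, 30), 3: Fraction(1, 42)}

def zeta_even(n):
    # Theorem 5.6.3: zeta(2n) = (-1)^(n-1)/2 * B_{2n}/(2n)! * (2*pi)^(2n)
    b = float(EVEN_BERNOULLI[n])
    return (-1) ** (n - 1) / 2 * b / math.factorial(2 * n) * (2 * math.pi) ** (2 * n)

def zeta_partial_sum(s, terms=100000):
    # truncated series for zeta(s)
    return sum(1 / k ** s for k in range(1, terms + 1))

for n in (1, 2, 3):
    print(2 * n, zeta_even(n), zeta_partial_sum(2 * n))
```

For n = 1 the closed form equals π²/6; the partial sum of 1/k² falls short of it by about 10⁻⁵ at this truncation, while for n = 2 and n = 3 the two columns agree to many more digits.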


14 In fact, below we see β = 1 and the radius is 2π.
15 With $f_n(x) = 1/(n^2 + x)$ and $g_n = r!/n^2$, $|f_n(x)| + |f_n'(x)| + \cdots + |f_n^{(r-1)}(x)| \le g_n$
and $\sum g_n = r!\,\zeta(2)$.
As an immediate consequence, we obtain the radius of convergence of the
Bernoulli series (5.6.7).


Theorem 5.6.4. The radius of convergence of the Bernoulli series (5.6.7)
is 2π.


The derivation is an immediate consequence of the previous theorem, the
root test (§3.4), and the fact ζ(∞) = 1.
Inserting the formula for ζ(2n) into the series for τ(2πx) yields

\[ \frac{2\pi x}{1 - e^{-2\pi x}} = 1 + \pi x + 2\zeta(2)x^2 - 2\zeta(4)x^4 + 2\zeta(6)x^6 - \cdots, \qquad |x| < 1. \qquad (5.6.9) \]
This series at once encapsulates our use of the Bernoulli function to compute
explicitly ζ(2), ζ(4), ζ(6), ... and may be used directly to compute the values
of ζ at the even naturals.

A natural question is whether one can find a function whose Taylor series
has coefficients involving all of $\zeta(2), \zeta(3), \zeta(4), \ldots$. It turns out $\log(x!)$ serves
this purpose (Exercise 5.6.12); alas, however, $\log(x!)$ is not tractable enough
to compute explicitly any of the odd values $\zeta(2n + 1)$, not even $\zeta(3)$.

Above, we saw that relating an infinite series to an infinite product
led to some nice results. In particular, we derived the infinite product for
$\sinh \pi x/\pi x$, which we rewrite, now, as

$$1 + \frac{\pi^2x^2}{3!} + \frac{\pi^4x^4}{5!} + \cdots = \prod_{n=1}^{\infty}\left(1 + \frac{x^2}{n^2}\right), \qquad x \ne 0. \tag{5.6.10}$$

We wish to derive the analog of this result for the sine function, i.e., we want
to obtain (5.6.10) with $-x^2$ replacing $x^2$.
To this end, consider the following identity

$$1 + b_1x + b_2x^2 + \cdots = \prod_{n=1}^{\infty}(1 + a_nx), \qquad 0 < x < R. \tag{5.6.11}$$

We seek the relations between $(a_n)$ and $(b_n)$. As a special case, if we suppose
that $a_n = 0$ for all $n \ge 3$, (5.6.11) reduces to

$$1 + b_1x + b_2x^2 = (1 + a_1x)(1 + a_2x),$$

which implies $b_1 = a_1 + a_2$ and $b_2 = a_1a_2$. Similarly, if we suppose that $a_n = 0$
for all $n \ge 4$, (5.6.11) reduces to

$$1 + b_1x + b_2x^2 + b_3x^3 = (1 + a_1x)(1 + a_2x)(1 + a_3x),$$

which implies

$$b_1 = a_1 + a_2 + a_3 = \sum_i a_i,$$

$$b_2 = a_1a_2 + a_1a_3 + a_2a_3 = \sum_{i<j} a_ia_j,$$

and

$$b_3 = a_1a_2a_3 = \sum_{i<j<k} a_ia_ja_k.$$
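The relation between the $a_n$ and the $b_n$ in the finite case can be checked by brute force. The sketch below (function names are ours) expands $\prod (1 + a_nx)$ one factor at a time and compares the coefficients with the sums over increasing index sets.

```python
from itertools import combinations
from math import prod

def expand_product(a):
    """Coefficients [1, b_1, ..., b_N] of prod_n (1 + a_n x)."""
    coeffs = [1]
    for a_n in a:
        # Multiply the current polynomial by (1 + a_n x).
        coeffs = [c + a_n * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def symmetric_sum(a, k):
    """Sum of the products a_i a_j ... over index sets i < j < ... of size k."""
    return sum(prod(combo) for combo in combinations(a, k))

a = [2, 3, 5]
print(expand_product(a))                        # 1, b_1, b_2, b_3
print([symmetric_sum(a, k) for k in range(4)])  # the same numbers
```

With $a = (2, 3, 5)$, both lines give $b_1 = 10$, $b_2 = 31$, $b_3 = 30$.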


Theorem 5.6.5. Suppose that $(a_n)$ and $(b_n)$ are positive sequences and the
series in (5.6.11) converges on $(-R, R)$. Suppose also (5.6.11) holds; then

$$1 - b_1x + b_2x^2 - \cdots = \prod_{n=1}^{\infty}(1 - a_nx), \qquad 0 < x < R. \tag{5.6.12}$$


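We call (5.6.12) the alternating version of (5.6.11). Let us immediately
apply this theorem to derive the infinite product for the sine. Replacing $x$ by
$\sqrt{x}$ in (5.6.10), we obtain

$$1 + \frac{\pi^2x}{3!} + \frac{\pi^4x^2}{5!} + \cdots = \prod_{n=1}^{\infty}\left(1 + \frac{x}{n^2}\right), \qquad x > 0. \tag{5.6.13}$$

Applying the theorem to (5.6.13), with $a_n = 1/n^2$, and then replacing $x$ by $x^2$
gives the alternating series $1 - \pi^2x^2/3! + \pi^4x^4/5! - \cdots$ on the left and the
product of the factors $1 - x^2/n^2$ on the right. But this last series is the series for $\sin(\pi x)/\pi x$.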


Theorem 5.6.6. For $x \ne 0$,

$$\frac{\sin(\pi x)}{\pi x} = \prod_{n=1}^{\infty}\left(1 - \frac{x^2}{n^2}\right). \tag{5.6.14}$$

This is the alternating version of (5.6.1).
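A quick numerical look at (5.6.14) shows how slowly the product converges. This is a sketch only; the function name and the sample point $x = 0.5$ are our choices.

```python
from math import pi, sin

def sine_product(x, terms):
    """Partial product prod_{n=1}^{terms} (1 - x^2 / n^2)."""
    p = 1.0
    for n in range(1, terms + 1):
        p *= 1.0 - x * x / (n * n)
    return p

x = 0.5
exact = sin(pi * x) / (pi * x)
for terms in (10, 1000, 100000):
    print(terms, sine_product(x, terms), exact)
```

The relative error after $N$ factors is roughly $x^2/N$, so the product is better suited to deriving identities than to computing values of the sine.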
To derive (5.6.12) from (5.6.11), write the finite version of (5.6.11),

$$1 + b_1^{(N)}x + b_2^{(N)}x^2 + \cdots + b_N^{(N)}x^N = \prod_{n=1}^{N}(1 + a_nx), \tag{5.6.15}$$

and let $b_1^{(N)}, b_2^{(N)}, \ldots, b_N^{(N)}$ denote the coefficients obtained by expanding
the right side. Then

$$b_1^{(N)} = \sum_{1 \le i \le N} a_i \nearrow \sum_{1 \le i < \infty} a_i = b_1^{(\infty)},$$

as $N \nearrow \infty$,

$$b_2^{(N)} = \sum_{1 \le i < j \le N} a_ia_j \nearrow \sum_{1 \le i < j < \infty} a_ia_j = b_2^{(\infty)},$$

as $N \nearrow \infty$, and so on. Here $b_1^{(\infty)}, b_2^{(\infty)}, \ldots$, are defined as the positive infinite
sums

$$b_1^{(\infty)} = \sum_i a_i, \qquad b_2^{(\infty)} = \sum_{i<j} a_ia_j, \qquad \ldots$$

We want to show that $(b_n^{(\infty)})$ equals the given sequence $(b_n)$. For this, let
$N \nearrow \infty$ in (5.6.15). Since $x$ is positive, there is no problem with the limits
(everything is increasing), and we get

$$1 + b_1^{(\infty)}x + b_2^{(\infty)}x^2 + \cdots = \prod_{n=1}^{\infty}(1 + a_nx), \qquad 0 < x < R.$$

Since the coefficients of a power series are unique (§3.5), this and (5.6.11)
yield $b_n = b_n^{(\infty)}$ for $n \ge 1$. Hence, $b_n^{(N)} \nearrow b_n$ for all $n \ge 1$, as $N \nearrow \infty$.
Now replace $x$ by $-x$ in (5.6.15) to get

$$1 - b_1^{(N)}x + b_2^{(N)}x^2 - \cdots + (-1)^N b_N^{(N)}x^N = \prod_{n=1}^{N}(1 - a_nx), \qquad 0 < x < R.$$

Clearly, as $N \nearrow \infty$, the right side of this last equation decreases to the right
side of (5.6.12) ($a_n \to 0$ since $\sum a_n < \infty$ since $\prod(1 + a_nx)$ converges). Thus,
to derive the theorem, it is enough to show that

$$\sum_{n=1}^{N}(-1)^n b_n^{(N)}x^n \to \sum_{n=1}^{\infty}(-1)^n b_nx^n, \qquad N \nearrow \infty.$$

But, for $0 < x < R$,

$$\left|\sum_{n=1}^{N}(-1)^n b_n^{(N)}x^n - \sum_{n=1}^{\infty}(-1)^n b_nx^n\right| \le \sum_{n=1}^{N}\left[b_n - b_n^{(N)}\right]x^n + \sum_{n=N+1}^{\infty} b_nx^n.$$

Now the second sum on the right is the tail (§1.6) of a convergent series,
hence goes to zero, as $N \nearrow \infty$, whereas the first sum on the right goes to
zero by the dominated convergence theorem for series. Indeed, the terms in
the first sum on the right are no greater than $g_n = b_nx^n$ with $\sum g_n$ finite by
assumption. Thus, we arrive at (5.6.12).
Exercises


5.6.1. Use (5.6.9) to compute $\zeta(2)$, $\zeta(4)$, $\zeta(6)$, $\zeta(8)$.


5.6.2. Use the recursion (5.6.8) to derive $|B_n| \le 2^n n!$, $n \ge 1$. Conclude
that the Bernoulli series (5.6.7) converges on $(-1/2, 1/2)$. Also show that the
numbers $(B_2, B_4, B_6, \ldots)$, form an alternating sequence $(+, -, +, \ldots)$.


5.6.3. If $(a_n)$ is a positive sequence, then

$$\sum_{n=1}^{\infty} a_n \le \prod_{n=1}^{\infty}(1 + a_n) \le \exp\left(\sum_{n=1}^{\infty} a_n\right). \tag{5.6.16}$$

Conclude

$$\sum a_n < \infty \quad\text{iff}\quad \prod(1 + a_n) < \infty.$$


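5.6.4. Use Exercise 5.1.5 to show that

$$\Gamma(x) = \frac{1}{x}\prod_{n=1}^{\infty}\frac{\left(1 + \frac{1}{n}\right)^x}{1 + \frac{x}{n}}, \qquad x > 0,$$

and

$$x! = e^{-\gamma x}\prod_{n=1}^{\infty}\frac{e^{x/n}}{1 + \frac{x}{n}}, \qquad x > -1.$$

Here $\gamma$ is Euler's constant (Exercise 4.4.18). (Use $1 + 1/2 + \cdots + 1/n - \log n \to \gamma$
and $n^x = e^{x\log n}$.)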


5.6.5. Use Exercise 5.1.5 applied to $\Gamma(x)$ and $\Gamma(1 - x)$ to show that

$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)}, \qquad 0 < x < 1.$$
5.6.6. Let

$$B(x) = 1 + \sum_{n=1}^{\infty}(-1)^n\frac{B_{2n}}{(2n)!}x^{2n}.$$

Use Exercise 1.7.7 to show
$(x/2)\cos(x/2) = B(x)\sin(x/2)$, for $|x| < 2\pi\beta$.

Conclude

$$\frac{x}{2}\cot\left(\frac{x}{2}\right) = 1 + \sum_{n=1}^{\infty}(-1)^n\frac{B_{2n}}{(2n)!}x^{2n}, \qquad 0 < |x| < \min(2\pi\beta, 2\pi).$$

5.6.7. Use Exercise 5.6.6 to conclude that $\beta \le 1$, i.e., the radius of convergence
of the Bernoulli series (5.6.7) is no more than $2\pi$.


5.6.8. Use (5.6.14) and modify the development leading up to (5.6.6) to
obtain

$$\pi\cot(\pi x) - \frac{1}{x} = \sum_{n=1}^{\infty}\frac{2x}{x^2 - n^2}, \qquad 0 < |x| < 1.$$


5.6.9. Use Exercise 5.6.6 above and Exercise 3.6.13 to obtain

$$\tan x = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{B_{2n}}{(2n)!}2^{2n}(2^{2n} - 1)x^{2n-1}, \qquad |x| < \pi\beta/2.$$


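5.6.10. Use Exercise 5.6.4, and differentiate under the summation sign to
get

$$\frac{d}{dx}\log\Gamma(x) = \frac{\Gamma'(x)}{\Gamma(x)} = -\gamma - \frac{1}{x} + \sum_{n=1}^{\infty}\left(\frac{1}{n} - \frac{1}{x+n}\right), \qquad x > 0,$$

and

$$\frac{d}{dx}\log(x!) = -\gamma + \sum_{n=1}^{\infty}\left(\frac{1}{n} - \frac{1}{x+n}\right), \qquad x > -1.$$

$\psi(x) = \Gamma'(x)/\Gamma(x)$ is the digamma function.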


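5.6.11. Using the previous exercise, show

$$\frac{1}{r!}\frac{d^{r+1}}{dx^{r+1}}\log(x!) = (-1)^{r+1}\sum_{n=1}^{\infty}\frac{1}{(n+x)^{r+1}}, \qquad r \ge 1,\ x > -1,$$

and

$$\left|\frac{1}{r!}\frac{d^{r+1}}{dx^{r+1}}\log(x!)\right| \le \frac{1}{(x+1)^{r+1}} + \zeta(2), \qquad r \ge 1,\ x > -1.$$

(To justify the differentiation, assume first $x$ is in $[-1 + \epsilon, 1/\epsilon]$.)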


5.6.12. Derive the zeta series

$$\log(x!) = -\gamma x + \frac{1}{2}\zeta(2)x^2 - \frac{1}{3}\zeta(3)x^3 + \frac{1}{4}\zeta(4)x^4 - \cdots, \qquad 1 \ge x > -1.$$

Conclude

$$\gamma = \frac{\zeta(2)}{2} - \frac{\zeta(3)}{3} + \frac{\zeta(4)}{4} - \cdots.$$

(Use the previous exercise to estimate the remainder as in Exercise 3.5.8 or
Exercise 4.4.30.)

5.6.13. Use Exercise 5.6.10 to show

$$\Gamma'(1) = -\gamma \quad\text{and}\quad \Gamma'(2) = 1 - \gamma.$$

Conclude the global minimum of $\Gamma(x)$, $x > 0$, lies in the interval $(1, 2)$.


5.6.14. Differentiate under the summation sign to obtain the Laplace transform
of $\tau$,

$$\frac{d^2}{dx^2}\log\Gamma(x) = \psi'(x) = \int_0^{\infty} e^{-xt}\tau(t)\,dt, \qquad x > 0.$$

(Exercise 5.6.10 above and Exercise 5.1.13.)

5.6.15. Use Exercise 5.6.10 to show 


sin { ir[a-2)/2] 1 } 1 


“37T[1—2)/2]  x—-1f 2” 


5.7 Jacobi’s Theta Functions 


The theta function is defined by

$$\theta(s) = \sum_{-\infty}^{\infty} e^{-n^2\pi s} = 1 + 2e^{-\pi s} + 2e^{-4\pi s} + 2e^{-9\pi s} + \cdots, \qquad s > 0.$$

This positive sum, over all integers $n$ (positive and negative and zero), converges
for $s > 0$ since

$$\sum_{-\infty}^{\infty} e^{-n^2\pi s} \le \sum_{-\infty}^{\infty} e^{-|n|\pi s} = 1 + 2\sum_{n=1}^{\infty} e^{-n\pi s} = 1 + \frac{2e^{-\pi s}}{1 - e^{-\pi s}} < \infty.$$

Recall (Exercise 5.1.9) the function

$$\psi(s) = \sum_{n=1}^{\infty} e^{-n^2\pi s}.$$

This is related to $\theta$ by $\theta = 1 + 2\psi$. The main result in this section is the
following remarkable identity, which we need in the next section.

Theorem 5.7.1 (Theta Functional Equation). For all $s > 0$,

$$\theta(1/s) = \sqrt{s}\,\theta(s), \qquad s > 0, \tag{5.7.1}$$

which can be rewritten as

$$\sum_{-\infty}^{\infty} e^{-n^2\pi/s} = \sqrt{s}\sum_{-\infty}^{\infty} e^{-n^2\pi s}, \qquad s > 0. \tag{5.7.2}$$


As one indication of the power of (5.7.2), plug in $s = .01$. Then the series
for $\theta(.01)$ converges slowly (the tenth term is $1/e^{\pi}$), whereas the series for
$\theta(100)$ converges quickly. In fact, the sum $\theta(100)$ of the series on the left
differs from its zeroth term 1 by about $2e^{-100\pi}$, which is less than $10^{-135}$.
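The functional equation is easy to test numerically. The sketch below truncates the series at $|n| \le 200$, which is an arbitrary cutoff of ours; it is ample for the sample values of $s$ used here.

```python
from math import exp, pi, sqrt

def theta(s, terms=200):
    """theta(s) = sum over all integers n of exp(-n^2 pi s), truncated at |n| <= terms."""
    return 1.0 + 2.0 * sum(exp(-n * n * pi * s) for n in range(1, terms + 1))

for s in (0.01, 0.5, 2.0):
    print(s, theta(1.0 / s), sqrt(s) * theta(s))
```

For $s = .01$ the left column needs only the zeroth term, while the right column requires dozens of terms to reach the same value.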

In terms of $\psi$, the functional equation becomes

$$1 + 2\psi(1/s) = \sqrt{s}\,[1 + 2\psi(s)]. \tag{5.7.3}$$


To derive (5.7.1), we need to introduce three power series, Jacobi's theta
functions, and relate them to the arithmetic-geometric mean of §5.3. These
functions' most striking property, double-periodicity, does not appear unless
one embraces the complex plane. Nevertheless, within the confines of the real
line, we shall be able to get somewhere.

The Jacobi theta functions, defined for $|q| < 1$, are¹⁶

$$\theta_0(q) = \sum_{-\infty}^{\infty} q^{n^2} = 1 + 2q + 2q^4 + 2q^9 + \cdots,$$

$$\theta_-(q) = \sum_{-\infty}^{\infty}(-1)^n q^{n^2} = 1 - 2q + 2q^4 - 2q^9 + \cdots,$$

and

$$\theta_+(q) = \sum_{-\infty}^{\infty} q^{(n+1/2)^2} = 2q^{1/4} + 2q^{9/4} + 2q^{25/4} + \cdots.$$

By comparing these series with the geometric series, we see that they all
converge for $|q| < 1$.
The simplest properties of these functions depend on parity properties of
integers. For example, because $n$ is odd iff $n^2$ is odd, $\theta_0(-q) = \theta_-(q)$, since

$$\theta_0(-q) = \sum_{-\infty}^{\infty}(-1)^{n^2}q^{n^2} = \sum_{-\infty}^{\infty}(-1)^n q^{n^2} = \theta_-(q).$$

Similarly, since $\theta_-(q)$ is the alternating version (§1.7) of $\theta_0(q)$,

$$\theta_0(q) + \theta_-(q) = 2\sum_{n\ \mathrm{even}} q^{n^2} = 2\sum_{-\infty}^{\infty} q^{(2n)^2} = 2\sum_{-\infty}^{\infty}\left(q^4\right)^{n^2} = 2\theta_0\left(q^4\right). \tag{5.7.4}$$

16 The index notation in $\theta_0$, $\theta_+$, $\theta_-$ is not standard.
In the remainder of the section, we restrict $q$ to lie in the interval $(0, 1)$. In
this case, $\theta_0(q) > 1$, $\theta_-(q)$ is bounded in absolute value by 1 by the Leibnitz
test, and, hence, $\theta_-^2(q) < \theta_0^2(q)$.

For $n \ge 0$, let $\sigma(n)$ be the number of ways of writing $n$ as a sum of
squares, $n = i^2 + j^2$, with $i, j \in \mathbf{Z}$, where permutations and signs are taken
into account. Thus,

$\sigma(0) = 1$ because $0 = 0^2 + 0^2$,
$\sigma(1) = 4$ because $1 = (\pm 1)^2 + 0^2 = 0^2 + (\pm 1)^2$,
$\sigma(2) = 4$ because $2 = (\pm 1)^2 + (\pm 1)^2$,
$\sigma(3) = 0$,
$\sigma(4) = 4$ because $4 = (\pm 2)^2 + 0^2 = 0^2 + (\pm 2)^2$,
$\sigma(5) = 8$ because $5 = (\pm 2)^2 + (\pm 1)^2 = (\pm 1)^2 + (\pm 2)^2$,
$\sigma(6) = \sigma(7) = 0$,
$\sigma(8) = 4$ because $8 = (\pm 2)^2 + (\pm 2)^2$,
$\sigma(9) = 4$ because $9 = (\pm 3)^2 + 0^2 = 0^2 + (\pm 3)^2$,
$\sigma(10) = 8$ because $10 = (\pm 1)^2 + (\pm 3)^2 = (\pm 3)^2 + (\pm 1)^2$,

etc.
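Then

$$\theta_0^2(q) = \left(\sum_{-\infty}^{\infty} q^{n^2}\right)^2 \tag{5.7.5}$$

$$= \left(\sum_{i \in \mathbf{Z}} q^{i^2}\right)\left(\sum_{j \in \mathbf{Z}} q^{j^2}\right)$$

$$= \sum_{i,j \in \mathbf{Z}} q^{i^2 + j^2} \tag{5.7.6}$$

$$= \sum_{n=0}^{\infty}\left(\sum_{i^2 + j^2 = n} q^n\right) \tag{5.7.7}$$

$$= \sum_{n=0}^{\infty}\sigma(n)q^n. \tag{5.7.8}$$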
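The counting function $\sigma(n)$ and the identity (5.7.8) can both be checked by brute force. In this sketch (function names are ours), `sigma` counts lattice points directly, while `theta0_squared_coeffs` squares the truncated series for $\theta_0$ and reads off the coefficients.

```python
def sigma(n):
    """Number of integer pairs (i, j) with i^2 + j^2 = n, signs and order counted."""
    r = int(n ** 0.5) + 1
    return sum(1 for i in range(-r, r + 1) for j in range(-r, r + 1) if i * i + j * j == n)

def theta0_squared_coeffs(limit):
    """Coefficients of q^0..q^limit in (sum_n q^(n^2))^2, by multiplying the series."""
    series = [0] * (limit + 1)
    n = 0
    while n * n <= limit:
        series[n * n] += 1 if n == 0 else 2   # q^(n^2) appears for both n and -n
        n += 1
    square = [0] * (limit + 1)
    for i, ci in enumerate(series):
        if ci:
            for j, cj in enumerate(series[: limit + 1 - i]):
                square[i + j] += ci * cj
    return square

print([sigma(n) for n in range(11)])
print(theta0_squared_coeffs(10))
```

Both lines reproduce the values $\sigma(0), \ldots, \sigma(10)$ tabulated above.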


Similarly, since $n$ is even iff $n^2$ is even,

$$\theta_-^2(q) = \sum_{n=0}^{\infty}(-1)^n\sigma(n)q^n. \tag{5.7.9}$$
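Now if $n = i^2 + j^2$, then $2n = (i + j)^2 + (i - j)^2 = k^2 + \ell^2$. Conversely, if
$2n = k^2 + \ell^2$, then $n = ((k + \ell)/2)^2 + ((k - \ell)/2)^2 = i^2 + j^2$. Thus,

$$\sigma(2n) = \sigma(n), \qquad n \ge 1.$$

For example, $\sigma(1) = \sigma(2) = \sigma(4) = \sigma(8)$. Here we used the fact that $k^2 + \ell^2$ is
even iff $k + \ell$ is even iff $k - \ell$ is even. Since the series (5.7.9) is the alternating
version of the series (5.7.8),

$$\theta_0^2(q) + \theta_-^2(q) = 2\sum_{n\ \mathrm{even}}\sigma(n)q^n = 2\sum_{n=0}^{\infty}\sigma(2n)q^{2n} = 2\sum_{n=0}^{\infty}\sigma(n)q^{2n} = 2\theta_0^2\left(q^2\right). \tag{5.7.10}$$

Now subtract (5.7.10) from the square of (5.7.4). You obtain

$$2\theta_0(q)\theta_-(q) = [\theta_0(q) + \theta_-(q)]^2 - \left[\theta_0^2(q) + \theta_-^2(q)\right] = 4\theta_0^2\left(q^4\right) - 2\theta_0^2\left(q^2\right) = 2\theta_-^2\left(q^2\right), \tag{5.7.11}$$

where the last equality is by (5.7.10) again. Rewriting (5.7.10) and (5.7.11),
we have arrived at the AGM iteration

$$\frac{\theta_0^2(q) + \theta_-^2(q)}{2} = \theta_0^2\left(q^2\right), \qquad \sqrt{\theta_0^2(q)\,\theta_-^2(q)} = \theta_-^2\left(q^2\right). \tag{5.7.12}$$

Setting $a_0 = \theta_0^2(q)$ and $b_0 = \theta_-^2(q)$, let $(a_n)$, $(b_n)$, be the AGM iteration
(5.3.2), (5.3.3). Iterating (5.7.12), we obtain

$$a_n = \theta_0^2\left(q^{2^n}\right)$$

and

$$b_n = \theta_-^2\left(q^{2^n}\right),$$

$n \ge 0$. Since $\theta_0(0) = 1 = \theta_-(0)$, $q^{2^n} \to 0$, and $a_n \to M(a_0, b_0)$, $b_n \to M(a_0, b_0)$,
we arrive at $M(a_0, b_0) = 1$ or $M\left(\theta_0^2(q), \theta_-^2(q)\right) = 1$.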


Theorem 5.7.2. Suppose that $(a, b)$ lies in the first quadrant of the $ab$-plane
with $a > 1 > b$. Then $(a, b)$ lies on the AGM curve

$$M(a, b) = 1 \quad\text{iff}\quad (a, b) = \left(\theta_0^2(q), \theta_-^2(q)\right)$$

for a unique $0 < q < 1$. In particular,

$$M\left(\theta_0^2(q), \theta_-^2(q)\right) = 1, \qquad 0 < q < 1. \tag{5.7.13}$$
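Identity (5.7.13) lends itself to a direct numerical check. This is a sketch under our own choices: the series are truncated at 60 terms, and `agm` simply runs a fixed number of arithmetic and geometric mean steps, which is more than enough given the quadratic convergence of the iteration.

```python
from math import sqrt

def theta0(q, terms=60):
    """theta_0(q) = 1 + 2q + 2q^4 + 2q^9 + ... (truncated)."""
    return 1.0 + 2.0 * sum(q ** (n * n) for n in range(1, terms + 1))

def theta_minus(q, terms=60):
    """theta_-(q) = 1 - 2q + 2q^4 - 2q^9 + ... (truncated)."""
    return 1.0 + 2.0 * sum((-1) ** n * q ** (n * n) for n in range(1, terms + 1))

def agm(a, b, steps=40):
    """Arithmetic-geometric mean M(a, b) by iterating the two means."""
    for _ in range(steps):
        a, b = (a + b) / 2.0, sqrt(a * b)
    return a

for q in (0.1, 0.5, 0.8):
    print(q, agm(theta0(q) ** 2, theta_minus(q) ** 2))
```

Each printed mean should agree with 1 to within rounding error.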


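Above we derived (5.7.13). To get the rest, suppose that $a > 1 > b > 0$ and
$M(a, b) = 1$. Since $\theta_0^2 : (0, 1) \to (1, \infty)$ is a bijection (Exercise 5.7.1), there
is a unique $q$ in $(0, 1)$, satisfying $a = \theta_0^2(q)$. Then by (5.7.13), $M\left[a, \theta_-^2(q)\right] = 1 = M(a, b)$.
Since $b \mapsto M(a, b)$ is strictly increasing, we must have $b = \theta_-^2(q)$.

Now let $c_n = \sqrt{a_n^2 - b_n^2}$, $n \ge 0$, be given by (5.3.4). We show that

$$c_n = \theta_+^2\left(q^{2^n}\right), \qquad n \ge 0. \tag{5.7.14}$$

To this end, compute

$$\theta_0^2(q) - \theta_0^2\left(q^2\right) = \sum_{n=0}^{\infty}\sigma(n)q^n - \sum_{n=0}^{\infty}\sigma(2n)q^{2n} = \sum_{n\ \mathrm{odd}}\sigma(n)q^n = \sum_{\substack{i,j \in \mathbf{Z} \\ i^2 + j^2\ \mathrm{odd}}} q^{i^2 + j^2}.$$

Now $i^2 + j^2$ is odd iff $i + j$ is odd iff $i - j$ is odd, in which case $k = (j + i - 1)/2$
and $\ell = (j - i - 1)/2$ are integers. Solving, since $i = k - \ell$, and $j = k + \ell + 1$,
the last sum equals

$$\theta_0^2(q) - \theta_0^2\left(q^2\right) = \sum_{k,\ell \in \mathbf{Z}} q^{(k-\ell)^2 + (k+\ell+1)^2} = \sum_{k,\ell \in \mathbf{Z}} q^{2(k^2 + k + 1/4) + 2(\ell^2 + \ell + 1/4)} = \left[\sum_{k=-\infty}^{\infty}\left(q^2\right)^{(k+1/2)^2}\right]^2 = \theta_+^2\left(q^2\right).$$

Hence,

$$\theta_0^2\left(q^2\right) + \theta_+^2\left(q^2\right) = \theta_0^2(q). \tag{5.7.15}$$

Adding (5.7.10) and (5.7.15) leads to

$$\theta_0^2\left(q^2\right) - \theta_+^2\left(q^2\right) = \theta_-^2(q).$$

Multiplying the last two equations and recalling (5.7.11) lead to

$$\theta_0^4\left(q^2\right) = \theta_-^4\left(q^2\right) + \theta_+^4\left(q^2\right). \tag{5.7.16}$$

Now replacing $q^2$ by $q^{2^n}$ in (5.7.16) leads to (5.7.14) since $a_n^2 = b_n^2 + c_n^2$. This
establishes (5.7.14).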
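From §5.3, we know that $c_n \to 0$. Let us compute the rate at which this
happens. It turns out that the decay rate is exponential, in the sense that

$$\lim_{n \to \infty}\frac{1}{2^n}\log(c_n) = \frac{1}{2}\log q. \tag{5.7.17}$$

To see this, let us denote, for clarity, $2^n = N$. Then by (5.7.14),

$$\frac{1}{N}\log(c_n) = \frac{1}{N}\log\left(\theta_+^2\left(q^N\right)\right) = \frac{2}{N}\log\left(2q^{N/4} + 2q^{9N/4} + 2q^{25N/4} + \cdots\right) = \frac{2}{N}\left[\log 2 + (N/4)\log q + \log\left(1 + q^{2N} + q^{6N} + \cdots\right)\right].$$

Hence, since $q^N \to 0$, we obtain (5.7.17). This computation should be compared
with Exercise 5.5.7. Now (5.3.20) says

$$\lim_{n \to \infty}\frac{1}{2^n}\log\left(\frac{a_n}{c_n}\right) = \frac{\pi}{2}\,Q\left(\frac{b}{a}\right).$$

Inserting $a_n \to \theta_0(0) = 1$ and (5.7.17) into this equation and recalling $a = \theta_0^2(q)$,
$b = \theta_-^2(q)$, we obtain

$$-\frac{1}{\pi}\log q = Q\left(\frac{\theta_-^2(q)}{\theta_0^2(q)}\right). \tag{5.7.18}$$

Here $Q(x) = M(1, x)/M(1, x')$. Solving for $q$, we obtain the following sharpening
of the previous theorem.


Theorem 5.7.3. Suppose that $(a, b)$ satisfies $M(a, b) = 1$, $a > 1 > b > 0$,
and let $q \in (0, 1)$ be such that $(a, b) = \left(\theta_0^2(q), \theta_-^2(q)\right)$. Then

$$q = e^{-\pi Q(b/a)}. \tag{5.7.19}$$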


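Now go back and look at (5.3.15). In Exercise 5.7.5, (5.3.15) is improved
to an equality.
Now let $q = e^{-\pi s}$, $s > 0$, and set

$$\theta_0(s) = \theta_0\left(e^{-\pi s}\right), \qquad \theta_\pm(s) = \theta_\pm\left(e^{-\pi s}\right).$$

Then (5.7.18) can be written

$$s = Q\left(\frac{\theta_-^2(s)}{\theta_0^2(s)}\right) = \frac{M\left(1, \theta_-^2(s)/\theta_0^2(s)\right)}{M\left(1, \left(\theta_-^2(s)/\theta_0^2(s)\right)'\right)}, \qquad s > 0. \tag{5.7.20}$$

Replacing $s$ by $1/s$ in this last equation and using $1/Q(x) = Q(x')$, we obtain

$$s = Q\left(\left(\frac{\theta_-^2(1/s)}{\theta_0^2(1/s)}\right)'\right) = Q\left(\frac{\theta_+^2(1/s)}{\theta_0^2(1/s)}\right), \qquad s > 0.$$

Here we use (5.7.16) to show that $\left(\theta_-^2/\theta_0^2\right)' = \theta_+^2/\theta_0^2$. Equating the last two
expressions for $s$ and using the strict monotonicity of $Q$ (Exercise 5.3.11),
we arrive at

$$\frac{\theta_-^2(s)}{\theta_0^2(s)} = \frac{\theta_+^2(1/s)}{\theta_0^2(1/s)}, \qquad s > 0. \tag{5.7.21}$$

Now we can derive the theta functional equation (5.7.1), as follows:

$$s\theta_0^2(s) = \frac{s\theta_0^2(s)}{M\left(\theta_0^2(s), \theta_-^2(s)\right)} \qquad [(5.7.13)]$$

$$= \frac{s}{M\left(1, \theta_-^2(s)/\theta_0^2(s)\right)} \qquad [\text{homogeneity}]$$

$$= \left[M\left(1, \left(\theta_-^2(s)/\theta_0^2(s)\right)'\right)\right]^{-1} \qquad [(5.7.20)]$$

$$= \left[M\left(1, \theta_+^2(s)/\theta_0^2(s)\right)\right]^{-1}$$

$$= \left[M\left(1, \theta_-^2(1/s)/\theta_0^2(1/s)\right)\right]^{-1} \qquad [(5.7.21)]$$

$$= \frac{\theta_0^2(1/s)}{M\left(\theta_0^2(1/s), \theta_-^2(1/s)\right)} \qquad [\text{homogeneity}]$$

$$= \theta_0^2(1/s). \qquad [(5.7.13)]$$

Since $\theta_0(s) = \theta(s)$, this completes the derivation of (5.7.1). Combining
(5.7.1) with (5.7.21), we obtain the companion functional equation

$$\sqrt{s}\,\theta_-\left(e^{-\pi s}\right) = \theta_+\left(e^{-\pi/s}\right), \qquad s > 0. \tag{5.7.22}$$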


Exercises 

5.7.1. Show that 0) and 6, are strictly increasing functions of (0,1) onto 
(1, 00). 

5.7.2. Derive (5.7.22). 


5.7.3. Show that $\theta_-$ is a strictly decreasing function of $(0, 1)$ onto $(0, 1)$.
(Use (5.7.22) to compute $\theta_-(1-)$.)


5.7.4. Compute $\sigma(n)$ for $n = 11, 12, 13, 14, 15$. Show that $\sigma(4n - 1) = 0$ for
$n \ge 1$.

5.7.5. Let $a > b > 0$, let $(a_n)$ and $(b_n)$ be the AGM iteration, and let $q$ be
as in (5.7.19). Show that

$$a_n - b_n = 8M(a, b)\,q^{2^n} \times \left(1 + 2q^{2^{n+2}} + q^{2^{n+3}} + \cdots\right)$$

for $n \ge 0$.

5.7.6. Let

$$\psi(t, x) = \sum_{n=1}^{\infty} e^{-n^2t}\cos(nx), \qquad t > 0,\ x \in \mathbf{R}.$$

Show $\psi$ satisfies the heat equation

$$\frac{\partial\psi}{\partial t} = \frac{\partial^2\psi}{\partial x^2}.$$


5.8 Riemann’s Zeta Function 


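In this section, we study the Riemann zeta function

$$\zeta(x) = \sum_{n=1}^{\infty}\frac{1}{n^x} = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \cdots, \qquad x > 1,$$

and we discuss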


The behavior of $\zeta$ near $x = 1$,
The extension of the domains of definition of $\Gamma$ and $\zeta$,
The functional equation,
The values of the zeta function at the nonpositive integers,
The Euler product, and
Primes in arithmetic progressions.


Most of the results in this section are due to Euler. Nevertheless, $\zeta$ is
associated with Riemann because, as Riemann showed, the subject really
takes off only after $x$ is allowed to range in the complex plane.

We already know that $\zeta(x)$ is smooth for $x > 1$ (Exercise 5.4.8) and
$\zeta(1) = \zeta(1+) = \infty$ (Exercise 5.1.12). We say that $f$ is asymptotically equal
to $g$ as $x \to a$, and we write $f(x) \sim g(x)$, as $x \to a$, if $f(x)/g(x) \to 1$, as
$x \to a$ (compare with $a_n \sim b_n$, §5.5).


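Theorem 5.8.1.

$$\zeta(x) \sim \frac{1}{x - 1}, \qquad x \to 1+.$$

We have to show that $(x - 1)\zeta(x) \to 1$ as $x \to 1+$. Multiply $\zeta(x)$ by $2^{-x}$
to get

$$2^{-x}\zeta(x) = \frac{1}{2^x} + \frac{1}{4^x} + \frac{1}{6^x} + \cdots, \qquad x > 1. \tag{5.8.1}$$

Then

$$\left(1 - \frac{2}{2^x}\right)\zeta(x) = 1 - \frac{1}{2^x} + \frac{1}{3^x} - \frac{1}{4^x} + \cdots, \qquad x > 1. \tag{5.8.2}$$

Now by the Leibnitz test, the series in (5.8.2) converges for $x > 0$, equals
$\log 2$ at $x = 1$ (Exercise 3.7.17), and

$$\lim_{x \to 1}\left(1 - \frac{1}{2^x} + \frac{1}{3^x} - \frac{1}{4^x} + \cdots\right) = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \log 2$$

by the dominated convergence theorem for series (see (5.2.19)). On the other
hand, by l'Hôpital's rule,

$$\lim_{x \to 1}\frac{1 - 2^{1-x}}{x - 1} = \log 2.$$

Thus,

$$\lim_{x \to 1+}(x - 1)\zeta(x) = \lim_{x \to 1+}\frac{x - 1}{1 - 2^{1-x}}\left(1 - \frac{2}{2^x}\right)\zeta(x) = 1.$$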

Thus, $\zeta(x)$ and $1/(x - 1)$ are asymptotically equal as $x \to 1+$. Nevertheless,
it may be possible that the difference $\zeta(x) - 1/(x - 1)$ still goes to infinity. For
example, $x^2$ and $x^2 + x$ are asymptotically equal as $x \to \infty$, but $(x^2 + x) - x^2 \to \infty$,
as $x \to \infty$. In fact, for $\zeta(x)$, we show that this does not happen.


Theorem 5.8.2.

$$\lim_{x \to 1+}\left[\zeta(x) - \frac{1}{x - 1}\right] = \gamma, \tag{5.8.3}$$

where $\gamma$ is Euler's constant.
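Both Theorem 5.8.1 and Theorem 5.8.2 can be observed numerically. The sketch below evaluates $\zeta(x)$ through the alternating series (5.8.2) rather than the defining series, which converges far too slowly near $x = 1$. Averaging two consecutive partial sums is a standard acceleration trick that we add here; it is not part of the text, and the numerical value of Euler's constant is quoted, not derived.

```python
def zeta(x, terms=100000):
    """zeta(x) for x > 0, x != 1, from the alternating series (5.8.2)."""
    partial = 0.0
    sign = 1.0
    for n in range(1, terms + 1):
        partial += sign / n ** x
        sign = -sign
    # Average two consecutive partial sums to damp the alternating error.
    eta = partial + 0.5 * sign / (terms + 1) ** x
    return eta / (1.0 - 2.0 ** (1.0 - x))

EULER_GAMMA = 0.5772156649015329  # quoted value of Euler's constant
for x in (1.1, 1.01, 1.001):
    print(x, (x - 1.0) * zeta(x), zeta(x) - 1.0 / (x - 1.0), EULER_GAMMA)
```

As $x$ decreases to 1, the second column approaches 1 and the third approaches $\gamma$.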


To see this, use Exercise 5.1.8 and $\Gamma(x) = (x - 1)\Gamma(x - 1)$ to get, for
$x > 1$,

$$\left[\zeta(x) - \frac{1}{x - 1}\right]\Gamma(x) = \zeta(x)\Gamma(x) - \Gamma(x - 1) = \int_0^{\infty}\frac{t^{x-1}}{e^t - 1}\,dt - \int_0^{\infty} e^{-t}t^{x-2}\,dt = \int_0^{\infty} t^{x-1}\left(\frac{1}{e^t - 1} - \frac{1}{te^t}\right)dt.$$

Applying the dominated convergence theorem (Exercise 5.8.1),

$$\lim_{x \to 1+}\left[\zeta(x) - \frac{1}{x - 1}\right]\Gamma(x) = \int_0^{\infty}\left(\frac{1}{e^t - 1} - \frac{1}{te^t}\right)dt. \tag{5.8.4}$$

But the integral in (5.8.4) is not easy to evaluate directly, so we abandon this
approach. Instead, we use the following identity.

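Theorem 5.8.3 (Sawtooth Formula). Let $f : (1, \infty) \to \mathbf{R}$ be differentiable,
decreasing, and nonnegative, and suppose that $f'$ is continuous. Then

$$\sum_{n=1}^{\infty} f(n) = \int_1^{\infty} f(t)\,dt + \int_1^{\infty}(1 + \lfloor t\rfloor - t)[-f'(t)]\,dt. \tag{5.8.5}$$

Here $\lfloor t\rfloor$ is the greatest integer $\le t$, and $0 \le 1 + \lfloor t\rfloor - t \le 1$ is the sawtooth
function (Figure 2.3 in §2.3). To get (5.8.5), break up the following integrals
of nonnegative functions and integrate by parts:

$$\int_1^{\infty} f(t)\,dt + \int_1^{\infty}(1 + \lfloor t\rfloor - t)[-f'(t)]\,dt = \sum_{n=1}^{\infty}\left\{\int_n^{n+1} f(t)\,dt + \int_n^{n+1}(1 + n - t)[-f'(t)]\,dt\right\}$$

$$= \sum_{n=1}^{\infty}\left\{\int_n^{n+1} f(t)\,dt + f(n) - \int_n^{n+1} f(t)\,dt\right\} = \sum_{n=1}^{\infty} f(n).$$

Now insert $f(t) = 1/t^x$ in (5.8.5), and evaluate the integral obtaining

$$\zeta(x) = \frac{1}{x - 1} + x\int_1^{\infty}\frac{1 + \lfloor t\rfloor - t}{t^{x+1}}\,dt, \qquad x > 1. \tag{5.8.6}$$

We wish to take the limit $x \to 1+$. Since the integrand is dominated by $1/t^2$
when $x > 1$, the dominated convergence theorem applies. Hence,

$$\lim_{x \to 1+}\left[\zeta(x) - \frac{1}{x - 1}\right] = \int_1^{\infty}\frac{1 + \lfloor t\rfloor - t}{t^2}\,dt.$$

But

$$\int_1^{\infty}\frac{1 + \lfloor t\rfloor - t}{t^2}\,dt = \lim_{N \nearrow \infty}\sum_{n=1}^{N}\int_n^{n+1}\frac{n + 1 - t}{t^2}\,dt = \lim_{N \nearrow \infty}\left(1 + \frac{1}{2} + \cdots + \frac{1}{N} - \log(N + 1)\right) = \gamma,$$

since each integral on the right equals $1/n - \log((n + 1)/n)$.
This completes the derivation of (5.8.3).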

The series expression for $\zeta(x)$ is valid only when $x > 1$. Below we extend
the domain of $\zeta$ to $x < 1$. To this end, we seek an alternate expression for
$\zeta$. Because the expression that we will find for $\zeta$ involves $\Gamma$, first, we extend
$\Gamma(x)$ to $x < 0$.

Recall (§5.1) that the gamma function is smooth and positive on $(0, \infty)$.
Hence, its reciprocal $L = 1/\Gamma$ is smooth there. Since $\Gamma(x + 1) = x\Gamma(x)$,
$L(x) = xL(x + 1)$. But $L(x + 1)$ is smooth on $x > -1$. Hence, we can use
this last equation to define $L(x)$ on $x > -1$ as a smooth function vanishing
at $x = 0$. Similarly, we can use $L(x) = xL(x + 1) = x(x + 1)L(x + 2)$ to
define $L(x)$ on $x > -2$ as a smooth function, vanishing at $x = 0$ and $x = -1$.
Continuing in this manner, the reciprocal $L = 1/\Gamma$ of the gamma function
extends to a smooth function on $\mathbf{R}$, vanishing at $x = 0, -1, -2, \ldots$. From this,
it follows that $\Gamma$ itself extends to a smooth function on $\mathbf{R} \setminus \{0, -1, -2, \ldots\}$.
Moreover (Exercise 5.8.3),

$$\Gamma(x) \sim \frac{(-1)^n}{n!}\cdot\frac{1}{x + n}, \qquad x \to -n,\ n \ge 0. \tag{5.8.7}$$


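To obtain an alternate expression for $\zeta$, start with

$$\psi(t) = \sum_{n=1}^{\infty} e^{-n^2\pi t}, \qquad t > 0,$$

use Exercise 5.1.9, and substitute $1/t$ for $t$ to get, for $x > 1$,

$$\pi^{-x/2}\Gamma(x/2)\zeta(x) = \int_0^{\infty}\psi(t)t^{x/2-1}\,dt = \int_0^1\psi(t)t^{x/2-1}\,dt + \int_1^{\infty}\psi(t)t^{x/2-1}\,dt$$

$$= \int_1^{\infty}\left[\psi\left(\frac{1}{t}\right)t^{-x/2-1} + \psi(t)t^{x/2-1}\right]dt. \tag{5.8.8}$$

Now by the theta functional equation (5.7.3),

$$\psi\left(\frac{1}{t}\right) = \frac{\sqrt{t}}{2} + \sqrt{t}\,\psi(t) - \frac{1}{2}.$$

So (5.8.8) leads to

$$\pi^{-x/2}\Gamma(x/2)\zeta(x) = \int_1^{\infty} t^{-x/2-1}\left(\frac{\sqrt{t}}{2} - \frac{1}{2}\right)dt + \int_1^{\infty}\psi(t)\left[t^{(1-x)/2} + t^{x/2}\right]\frac{dt}{t}.$$

Evaluating the first integral (recall $x > 1$), we obtain our alternate expression
for $\zeta$,

$$\pi^{-x/2}\Gamma(x/2)\zeta(x) = \frac{1}{x(x - 1)} + \int_1^{\infty}\psi(t)\left[t^{(1-x)/2} + t^{x/2}\right]\frac{dt}{t}, \tag{5.8.9}$$

valid for $x > 1$.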

Let us analyze (5.8.9). The integral on the right is a smooth function of $x$
on $\mathbf{R}$ (Exercise 5.8.4). Hence, the right side of (5.8.9) is a smooth function
of $x \ne 0, 1$. On the other hand, $\pi^{-x/2}$ is smooth and positive, and $L(x/2) = 1/\Gamma(x/2)$
is smooth on all of $\mathbf{R}$. Thus, (5.8.9) can be used to define $\zeta(x)$ as
a smooth function on $x \ne 0, 1$. Moreover, since $1/x\Gamma(x/2) = 1/2\Gamma(x/2 + 1)$,
(5.8.9) can be used to define $\zeta(x)$ as a smooth function near $x = 0$. Now
by (5.8.7), $\Gamma(x/2)(x + 2n) \to 2(-1)^n/n!$ as $x \to -2n$. So multiplying (5.8.9)
by $x + 2n$ and sending $x \to -2n$ yield

$$(-1)^n\pi^n(2/n!)\lim_{x \to -2n}\zeta(x) = \begin{cases} 0 & \text{if } n > 0, \\ -1 & \text{if } n = 0. \end{cases}$$

Thus, $\zeta(-2n) = 0$ for $n > 0$ and $\zeta(0) = -1/2$. Now the zeta function $\zeta(x)$ is
defined for all $x \ne 1$. We summarize the results.


Theorem 5.8.4. The zeta function can be defined, for all $x \ne 1$, as a smooth
function. Moreover, $\zeta(-2n) = 0$ for $n \ge 1$, and $\zeta(0) = -1/2$.


Now the right side of (5.8.9) is unchanged under the substitution $x \mapsto (1 - x)$.
This, immediately, leads to the following.


Theorem 5.8.5 (Zeta Functional Equation). If

$$\xi(x) = \pi^{-x/2}\Gamma(x/2)\zeta(x),$$

then

$$\xi(x) = \xi(1 - x), \qquad x \ne \ldots, -4, -2, 0, 1, 3, 5, \ldots. \tag{5.8.10}$$

Since we obtained $\zeta(2n)$, $n \ge 1$, in §5.6, plugging in $x = 2n$, $n \ge 1$, into
(5.8.10) leads us to the following.

Theorem 5.8.6. For all $n \ge 1$,

$$\zeta(1 - 2n) = -\frac{B_{2n}}{2n}.$$

For example, $\zeta(-1) = -1/12$. We leave the derivation of this as Exercise 5.8.5.
Now we know $\zeta(x)$ at all nonpositive integers and all positive
even integers. Even though this result is over 200 years old, similar expressions
for $\zeta(2n + 1)$, $n \ge 1$, have not yet been computed. In particular, very
little is known about $\zeta(3)$.

We turn to our last topic, the prime numbers. Before proceeding, the
reader may wish to review the Exercises in §1.3. That there is a connection
between the zeta function and the prime numbers was discovered by Euler
300 years ago.


Theorem 5.8.7 (Euler Product). For all $x > 1$,

$$\zeta(x) = \prod_p\left(1 - \frac{1}{p^x}\right)^{-1}.$$

Here the product¹⁷ is over all primes.


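This follows from the fundamental theorem of arithmetic (Exercise 1.3.17).
More specifically, from (5.8.1),

$$\zeta(x)\left(1 - \frac{1}{2^x}\right) = 1 + \frac{1}{3^x} + \frac{1}{5^x} + \cdots = 1 + \sum_{2 \nmid n}\frac{1}{n^x}, \qquad x > 1, \tag{5.8.11}$$

where $2 \nmid n$ means 2 does not divide $n$ and $n > 1$. Similarly, subtracting $1/3^x$
times (5.8.11) from (5.8.11) yields

$$\zeta(x)\left(1 - \frac{1}{2^x}\right)\left(1 - \frac{1}{3^x}\right) = 1 + \sum_{2 \nmid n,\ 3 \nmid n}\frac{1}{n^x}, \qquad x > 1.$$

Continuing in this manner,

$$\zeta(x)\prod_{n=1}^{N}\left(1 - \frac{1}{p_n^x}\right) = 1 + \sum_{p_1, p_2, \ldots, p_N \nmid n}\frac{1}{n^x}, \qquad x > 1, \tag{5.8.12}$$

where $p_1, p_2, \ldots, p_N$ are the first $N$ primes. But $p_1, p_2, \ldots, p_N \nmid n$ and $n > 1$
implies $n > N$. Hence, the series on the right side of (5.8.12) is no greater
than $\sum_{n=N+1}^{\infty} 1/n^x$, which goes to zero as $N \nearrow \infty$.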
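The Euler product can be tested at $x = 2$, where $\zeta(2) = \pi^2/6$ is known from §5.6. The sketch below (the sieve and the cutoff are our own choices) multiplies the factors for all primes up to a limit.

```python
from math import pi

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def euler_product(x, limit):
    """Product over primes p <= limit of 1 / (1 - p^(-x))."""
    result = 1.0
    for p in primes_up_to(limit):
        result /= 1.0 - p ** (-x)
    return result

print(euler_product(2.0, 100000), pi ** 2 / 6)
```

The truncated product falls slightly short of $\pi^2/6$, since every omitted factor exceeds 1.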


17 This equality and its derivation are valid whether or not there are infinitely many 
primes. 
Euler used his product to establish the infinitude of primes, as follows:
Since (Exercise 5.8.9)

$$0 < -\log(1 - a) \le 2a, \qquad 0 < a \le 1/2, \tag{5.8.13}$$

it follows that

$$\log\zeta(x) = \sum_p -\log\left(1 - \frac{1}{p^x}\right) \le 2\sum_p\frac{1}{p^x}, \qquad x > 1. \tag{5.8.14}$$

Now as $x \to 1+$, $\zeta(x) \to \infty$; hence, $\log\zeta(x) \to \infty$. On the other hand,
$\sum_p 1/p^x \to \sum_p 1/p$, as $x \searrow 1$, by the monotone convergence theorem. We
have arrived at the following.


Theorem 5.8.8. There are infinitely many primes. In fact, there are enough
of them so that

$$\sum_p\frac{1}{p} = \infty.$$


Our last topic is the infinitude of primes in arithmetic progressions. Let
$a$ and $b$ be naturals. An arithmetic progression is a subset of $\mathbf{N}$ of the form
$a\mathbf{N} + b = \{a + b, 2a + b, 3a + b, \ldots\}$. Apart from 2 and 3, every prime is either
in $4\mathbf{N} + 1$ or $4\mathbf{N} + 3$. Note that $p \in a\mathbf{N} + b$ iff $a$ divides $p - b$, which we write
as $a \mid p - b$. Here is Euler's result on primes in arithmetic progressions.


Theorem 5.8.9. There are infinitely many primes in $4\mathbf{N} + 1$ and in $4\mathbf{N} + 3$.
In fact, there are enough of them so that

$$\sum_{4 \mid p-1}\frac{1}{p} = \infty$$

and

$$\sum_{4 \mid p-3}\frac{1}{p} = \infty.$$
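We proceed by analogy with the preceding derivation. Instead of relating
$\sum_p 1/p^x$ to $\log\zeta(x)$, now, we relate $\sum_{4 \mid p-1} 1/p^x$ and $\sum_{4 \mid p-3} 1/p^x$ to $\log L_1(x)$
and $\log L_3(x)$, where

$$L_1(x) = \sum_{4 \mid n-1}\frac{1}{n^x}, \qquad x > 1,$$

and

$$L_3(x) = \sum_{4 \mid n-3}\frac{1}{n^x}, \qquad x > 1.$$

By comparison with $\zeta(x)$, the series $L_1(x)$ and $L_3(x)$ are finite for $x > 1$. To
make the analogy clearer, we define $\chi_1 : \mathbf{N} \to \mathbf{R}$ and $\chi_3 : \mathbf{N} \to \mathbf{R}$ by setting

$$\chi_1(n) = \begin{cases} 1, & n \in 4\mathbf{N} + 1, \\ 0, & \text{otherwise}, \end{cases}$$

and

$$\chi_3(n) = \begin{cases} 1, & n \in 4\mathbf{N} + 3, \\ 0, & \text{otherwise}. \end{cases}$$

Then

$$L_1(x) = \sum_{n=1}^{\infty}\frac{\chi_1(n)}{n^x}, \qquad L_3(x) = \sum_{n=1}^{\infty}\frac{\chi_3(n)}{n^x}, \qquad x > 1.$$

Proceeding further, the next step was to obtain an identity of the form

$$\sum_{n=1}^{\infty}\frac{\chi(n)}{n^x} = \prod_p\left(1 - \frac{\chi(p)}{p^x}\right)^{-1}, \qquad x > 1. \tag{5.8.15}$$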
rs Pp 


Denote the series in (5.8.15) by $L(x, \chi)$. Thus, $L_1(x) = L(x, \chi_1)$ and $L_3(x) = L(x, \chi_3)$.
When $\chi = \chi_1$ or $\chi = \chi_3$, however, (5.8.15) is false, and for a very
good reason: $\chi_1$ and $\chi_3$ are not multiplicative.


Theorem 5.8.10. Suppose that $\chi : \mathbf{N} \to \mathbf{R}$ is bounded and multiplicative,
i.e., suppose that $\chi(mn) = \chi(m)\chi(n)$ for all $m, n \in \mathbf{N}$. Then (5.8.15) holds.


The derivation of this is completely analogous to the previous case and
involves inserting factors of $\chi$ in (5.8.12). Having arrived at this point,
Euler bypassed the failure of (5.8.15) for $\chi_1$, $\chi_3$ by considering, instead,

$$\chi_+ = \chi_1 + \chi_3,$$

and

$$\chi_- = \chi_1 - \chi_3.$$

Then $\chi_+(n)$ is 1 or 0 according to whether $n$ is odd or even, $L(x, \chi_+)$ is given
by (5.8.11), $\chi_-$ is given by

$$\chi_-(n) = \begin{cases} 1, & 4 \mid n - 1, \\ -1, & 4 \mid n - 3, \\ 0, & n \text{ even}, \end{cases}$$

and

$$L(x, \chi_-) = 1 - \frac{1}{3^x} + \frac{1}{5^x} - \frac{1}{7^x} + \cdots.$$


But this is an alternating, hence convergent, series for $x > 0$ by the Leibnitz
test, and $L(1, \chi_-) > 0$. Moreover (Exercise 5.8.11), by the dominated
convergence theorem,

$$\lim_{x \to 1} L(x, \chi_-) = L(1, \chi_-) > 0. \tag{5.8.16}$$
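The two facts used about $\chi_-$, multiplicativity and $L(1, \chi_-) > 0$, are easy to confirm numerically. In this sketch the names are ours, and the comparison value $\pi/4$ is the classical sum of the series $1 - 1/3 + 1/5 - \cdots$, which the text does not need but which makes a convenient check.

```python
from math import pi

def chi_minus(n):
    """chi_-(n): +1 if n = 1 mod 4, -1 if n = 3 mod 4, 0 if n is even."""
    return (0, 1, 0, -1)[n % 4]

def L_chi_minus(x, terms=200000):
    """Partial sum of sum_n chi_-(n) / n^x, with the terms taken in pairs."""
    total = 0.0
    for k in range(terms):
        total += 1.0 / (4 * k + 1) ** x - 1.0 / (4 * k + 3) ** x
    return total

multiplicative = all(
    chi_minus(m * n) == chi_minus(m) * chi_minus(n)
    for m in range(1, 60) for n in range(1, 60)
)
print(multiplicative, L_chi_minus(1.0), pi / 4)
```

Grouping the terms in pairs makes every summand positive, which is also how positivity of $L(1, \chi_-)$ can be seen directly.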
Now the key point is that $\chi_+$ and $\chi_-$ are multiplicative (Exercise 5.8.10),
and, hence, (5.8.15) holds with $\chi = \chi_\pm$.
Proceeding, as in (5.8.14), and taking the log of (5.8.15) with $\chi = \chi_+$, we
obtain

$$\log L(x, \chi_+) \le 2\sum_p\frac{\chi_+(p)}{p^x}, \qquad x > 1.$$

Since, by (5.8.11), $\lim_{x \to 1+} L(x, \chi_+) = \infty$, sending $x \to 1+$, we conclude that

$$\lim_{x \to 1+}\sum_p\frac{\chi_+(p)}{p^x} = \infty. \tag{5.8.17}$$


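Turning to $\chi_-$, we claim it is enough to show that $\sum_p \chi_-(p)p^{-x}$ remains
bounded as $x \to 1+$. Indeed, assuming this claim, we have

$$\sum_{4 \mid p-1}\frac{1}{p} = \lim_{x \searrow 1}\sum_{4 \mid p-1}\frac{1}{p^x} = \lim_{x \searrow 1}\sum_p\frac{\chi_1(p)}{p^x} = \lim_{x \searrow 1}\frac{1}{2}\left[\sum_p\frac{\chi_+(p)}{p^x} + \sum_p\frac{\chi_-(p)}{p^x}\right] = \infty$$

by the monotone convergence theorem, the claim, and (5.8.17). This is the
first half of the theorem. Similarly,

$$\sum_{4 \mid p-3}\frac{1}{p} = \lim_{x \searrow 1}\sum_p\frac{\chi_3(p)}{p^x} = \lim_{x \searrow 1}\frac{1}{2}\left[\sum_p\frac{\chi_+(p)}{p^x} - \sum_p\frac{\chi_-(p)}{p^x}\right] = \infty.$$

This is the second half of the theorem.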
To complete the derivation, we establish the claim using

$$|-\log(1 - a) - a| \le a^2, \qquad |a| \le 1/2, \tag{5.8.18}$$

which follows from the power series for log (Exercise 5.8.9). Taking the log
of (5.8.15) and using (5.8.18) with $a = \chi_-(p)/p^x$, we obtain

$$\left|\log L(x, \chi_-) - \sum_p\frac{\chi_-(p)}{p^x}\right| \le \sum_p\frac{1}{p^{2x}} \le \sum_{n=1}^{\infty}\frac{1}{n^2}, \qquad x > 1.$$

By (5.8.16), $\log L(x, \chi_-) \to \log L(1, \chi_-)$ and so remains bounded as $x \to 1+$.
Since, by the last equation, $\sum_p \chi_-(p)p^{-x}$ differs from $\log L(x, \chi_-)$ by a
bounded quantity, this establishes the claim.

One hundred years after Euler's result, Dirichlet showed¹⁸ there are infinitely
many primes in any arithmetic progression $a\mathbf{N} + b$, as long as $a$ and
$b$ have no common factor.


Exercises 


5.8.1. Use the dominated convergence theorem to derive (5.8.4). (Exercise 3.5.7.)


5.8.2. Show that

$$\gamma = \int_0^{\infty}\left(\frac{1}{e^t - 1} - \frac{1}{te^t}\right)dt.$$


5.8.3. Derive (5.8.7).


5.8.4. Dominate $\psi$ by a geometric series to obtain $\psi(t) \le ce^{-\pi t}$, $t \ge 1$, where
$c = 1/(1 - e^{-\pi})$. Use this to show that the integral in (5.8.9) is a smooth
function of $x$ in $\mathbf{R}$.


5.8.5. Use (5.8.10) and the values $\zeta(2n)$, $n \ge 1$, obtained in §5.6, to show
that $\zeta(1 - 2n) = -B_{2n}/2n$, $n \ge 1$.


5.8.6. Let $I(x)$ denote the integral in (5.8.6). Show that $I(x)$ is finite, smooth
for $x > 0$, and satisfies $I'(x) = -(x + 1)I(x + 1)$. Compute $I(2)$.


5.8.7. Use (5.8.9) to check that $(x - 1)\zeta(x)$ is smooth and positive on $(1 - \delta, 1 + \delta)$
for $\delta$ small enough. Then differentiate $\log[(x - 1)\zeta(x)]$ for $1 < x < 1 + \delta$
using (5.8.6). Conclude that

$$\lim_{x \to 1}\left[\frac{\zeta'(x)}{\zeta(x)} + \frac{1}{x - 1}\right] = \gamma.$$
5.8.8. Differentiate the log of (5.8.10) to obtain $\zeta'(0) = -\frac{1}{2}\log(2\pi)$. (Use the
previous Exercise, Exercise 5.5.11, and Exercise 5.6.15.)


5.8.9. Derive (5.8.13) and (5.8.18) using the power series for $\log(1 + a)$.


5.8.10. Show that $\chi_\pm : \mathbf{N} \to \mathbf{R}$ are multiplicative.


5.8.11. Derive (5.8.16) using the dominated convergence theorem. (Group 
the terms in pairs, and use the mean value theorem to show that a~* —b~* < 
z/a®*!, b>a>0.) 


18 By replacing $\chi_\pm$ by the characters $\chi$ of the group $(\mathbf{Z}/a\mathbf{Z})^*$.
5.9 The Euler-Maclaurin Formula


Given a smooth function $f$ on $\mathbf{R}$, can we find a smooth function $g$ on $\mathbf{R}$
satisfying

$$f(x + 1) - f(x) = \int_x^{x+1} g(t)\,dt, \qquad x \in \mathbf{R}? \tag{5.9.1}$$

By the fundamental theorem, the answer is yes: $g = f'$, and $g$ is also smooth.
However, this is not the only solution because $g = f' + p'$ solves (5.9.1) for
any smooth periodic $p$, i.e., for any smooth $p$ satisfying $p(x + 1) = p(x)$ for
all $x \in \mathbf{R}$.

The starting point for the Euler-Maclaurin formula is to ask the same
question but with the left side in (5.9.1) modified. More precisely, given a
smooth function $f$ on $\mathbf{R}$, can we find a smooth function $g$ on $\mathbf{R}$ satisfying

$$f(x + 1) = \int_x^{x+1} g(t)\,dt, \qquad x \in \mathbf{R}? \tag{5.9.2}$$

Note that $g = 1$ works when $f = 1$. We call a $g$ satisfying (5.9.2) an
Euler-Maclaurin derivative of $f$.

It turns out the answer is yes and (5.9.2) is always solvable. To see this,
let $q$ denote a primitive of $g$. Then (5.9.2) becomes $f(x) = q(x) - q(x - 1)$.
Conversely, suppose that

$$f(x) = q(x) - q(x - 1), \qquad x \in \mathbf{R}, \tag{5.9.3}$$

for some smooth $q$. Then it is easy to check that $g = q'$ works in (5.9.2).
Thus, given $f$, (5.9.2) is solvable for some smooth $g$ iff (5.9.3) is solvable for
some smooth $q$.

In fact, it turns out that (5.9.3) is always solvable. Note, however, that $q$
solves (5.9.3) iff $q + p$ solves (5.9.3), where $p$ is any periodic smooth function,
i.e., $p(x + 1) = p(x)$, $x \in \mathbf{R}$. So the solution is not unique.

To solve (5.9.3), assume, in addition, that $f(x) = 0$ for $x < -1$, and define
$q$ by

$$q(x) = \begin{cases} f(x), & x \le 0, \\ f(x) + f(x - 1), & 0 \le x \le 1, \\ f(x) + f(x - 1) + f(x - 2), & 1 \le x \le 2, \\ \text{and so on.} \end{cases} \tag{5.9.4}$$

Then $q$ is well defined and smooth on $\mathbf{R}$ and (5.9.3) holds (Exercise 5.9.1).
Thus, (5.9.3) is solvable when $f$ vanishes on $(-\infty, -1)$. Similarly, (5.9.3) is
solvable when $f$ vanishes on $(1, \infty)$ (Exercise 5.9.2).
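The construction (5.9.4) amounts to summing the shifts $f(x), f(x - 1), f(x - 2), \ldots$ until they vanish. The sketch below illustrates this with a sample $f$ of our own choosing (smooth and zero for $x \le -1$); it is not a function from the text.

```python
from math import exp

def f(x):
    """A sample smooth function vanishing for x <= -1 (chosen only for illustration)."""
    return exp(-1.0 / (x + 1.0)) * x if x > -1.0 else 0.0

def q(x):
    """q(x) = f(x) + f(x - 1) + f(x - 2) + ...; only finitely many terms are nonzero."""
    total, shifted = 0.0, x
    while shifted > -1.0:
        total += f(shifted)
        shifted -= 1.0
    return total

for x in (-0.5, 0.3, 2.7, 10.0):
    print(x, f(x), q(x) - q(x - 1.0))
```

The last two columns agree, which is exactly the relation (5.9.3).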

To obtain the general case, we write $f = f_+ + f_-$, where $f_+ = 0$ on
$(-\infty, -1)$ and $f_- = 0$ on $(1, \infty)$. Then $q = q_+ + q_-$ solves (5.9.3) for $f$ if $q_\pm$
solve (5.9.3) for $f_\pm$. Thus, to complete the solution of (5.9.3), all we need do
is construct $f_\pm$.
Because we require $f_+$, $f_-$ to be smooth, it is not immediately clear this
can be done. To this end, we deal, first, with the special case $f = 1$, i.e., we
construct $\phi_\pm$ smooth and satisfying $\phi_+ = 0$ on $(-\infty, -1)$, $\phi_- = 0$ on $(1, \infty)$,
and $1 = \phi_+ + \phi_-$ on $\mathbf{R}$.

To construct $\phi_\pm$, let $h$ denote the function in Exercise 3.5.2. Then
$h : \mathbf{R} \to \mathbf{R}$ is smooth, $h = 0$ on $\mathbf{R}^-$, and $h > 0$ on $\mathbf{R}^+$. Set

$$\phi_+(x) = \frac{h(1 + x)}{h(1 - x) + h(1 + x)}, \qquad x \in \mathbf{R},$$

and

$$\phi_-(x) = \frac{h(1 - x)}{h(1 - x) + h(1 + x)}, \qquad x \in \mathbf{R}.$$

Since $h(1 - x) + h(1 + x) > 0$ on all of $\mathbf{R}$, $\phi_\pm$ are smooth with $\phi_+ = 0$ on
$(-\infty, -1)$, $\phi_- = 0$ on $(1, \infty)$, and $\phi_+ + \phi_- = 1$ on all of $\mathbf{R}$.
Now for smooth $f$, we may set $f_\pm = f\phi_\pm$, yielding $f = f_+ + f_-$ on all of
$\mathbf{R}$. Thus, (5.9.3) is solvable for all smooth $f$. Hence, (5.9.2) is solvable for all
smooth $f$.


Theorem 5.9.1. Any smooth $f$ on $\mathbf{R}$ has a smooth Euler-Maclaurin derivative
$g$ on $\mathbf{R}$.

Our main interest is to obtain a useful formula for an Euler—Maclaurin 
derivative g of f. To explain this, we denote f’ = Df, f” = D?f, f’” = D*°f, 
and so on. Then any polynomial in D makes sense. For example, D? + 2D? — 
D+5 is the differential operator that associates the smooth function f with 
the smooth function 


(D3 +2D? —D+5)f =f" +2f" — fi +5f. 


More generally, we may consider infinite linear combinations of powers of $D$.
For example,

$$\begin{aligned}
e^{tD} f(c) &= \left(1 + tD + \frac{t^2 D^2}{2!} + \frac{t^3 D^3}{3!} + \cdots\right) f(c) && (5.9.5)\\
&= f(c) + t f'(c) + \frac{t^2}{2!} f''(c) + \frac{t^3}{3!} f'''(c) + \cdots && (5.9.6)
\end{aligned}$$

may sum to $f(c+t)$, since this is the Taylor series, but, for general smooth
$f$, it may fail to converge to $f(c+t)$. When $f$ is a polynomial of degree $d$, (5.9.5) does
sum to $f(c+t)$. Hence, $e^{tD} f(c) = f(c+t)$. In fact, in this case, any power
series in $D$ applied to $f$ is another polynomial of degree at most $d$, as $D^n f = 0$ for
$n > d$. For example, if $B_n$, $n \ge 0$, are the Bernoulli numbers (§5.6), then

$$\tau(D) = 1 + B_1 D + \frac{B_2}{2!} D^2 + \frac{B_4}{4!} D^4 + \cdots \tag{5.9.7}$$

may be applied to any polynomial $f(x)$ of degree $d$. The result $\tau(D)f(x)$
then obtained is another polynomial of, at most, the same degree.
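For a polynomial the series (5.9.5) terminates, so the identity that applying the exponential of tD to f at c gives f(c + t) can be verified exactly. The sketch below does this with rational arithmetic; polynomials are stored as coefficient lists (lowest degree first), and the helper names `derivative`, `evaluate`, and `exp_tD` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    # coeffs[k] is the coefficient of x**k; returns the coefficients of f'.
    return [k * coeffs[k] for k in range(1, len(coeffs))]


def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))


def exp_tD(coeffs, c, t):
    # Sum of t**k / k! * f^(k)(c); the sum is finite because
    # D**n f = 0 once n exceeds the degree of f.
    total = Fraction(0)
    deriv = [Fraction(a) for a in coeffs]
    k = 0
    while deriv:
        total += Fraction(t) ** k / factorial(k) * evaluate(deriv, Fraction(c))
        deriv = derivative(deriv)
        k += 1
    return total


if __name__ == "__main__":
    f = [5, -1, 2, 1]  # f(x) = x^3 + 2x^2 - x + 5
    print(exp_tD(f, 2, 3), evaluate(f, 5))
```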
If $\tau(D)$ is applied to $f(x) = e^{ax}$ for $a$ real, we obtain

$$\tau(D) e^{ax} = \tau(a) e^{ax}, \tag{5.9.8}$$

where $\tau(a)$ is the Bernoulli function of §5.6,

$$\tau(a) = 1 + B_1 a + \frac{B_2}{2!} a^2 + \frac{B_4}{4!} a^4 + \cdots.$$

Thus, (5.9.8) is valid only on the interval of convergence of the power series
for $\tau(a)$.
Let $c(a)$ be a power series. To compute the effect of $c(D)$ on a product
$e^{ax} f(x)$, where $f$ is a polynomial, note that

$$D\left(e^{ax} x\right) = a x e^{ax} + e^{ax} = e^{ax}(ax + 1)$$

by the product rule. Repeating this with $D^2, D^3, \ldots$,

$$D^n\left(e^{ax} x\right) = e^{ax}\left(a^n x + n a^{n-1}\right).$$

Taking linear combinations, we conclude that

$$c(D)\left(e^{ax} x\right) = e^{ax}\left[c(a) x + c'(a)\right] = \frac{\partial}{\partial a}\left[c(a) e^{ax}\right].$$

Thus, $c(D)(e^{ax} x)$ is well defined for $a$ in the interval of convergence of $c(a)$.
Similarly, one checks that $c(D)(e^{ax} x^n)$ is well defined for any $n \ge 1$ and $a$ in
the interval of convergence (Exercise 5.9.3) and

$$c(D)\left(e^{ax} x^n\right) = \frac{\partial^n}{\partial a^n}\left[c(a) e^{ax}\right]. \tag{5.9.9}$$

We call a smooth function elementary if it is a product $e^{ax} f(x)$ of an exponential
$e^{ax}$, with $a$ in the interval of convergence of $\tau(a)$, and a polynomial $f(x)$.
In particular, any polynomial is elementary. Note that $\tau(D)f$ is elementary
whenever $f$ is elementary.


Theorem 5.9.2. Let $f$ be an elementary function. Then $\tau(D)f$ is an
Euler-Maclaurin derivative,

$$f(x+1) = \int_x^{x+1} \tau(D) f(t)\, dt, \qquad x \in \mathbf{R}. \tag{5.9.10}$$
To derive (5.9.10), start with $f(x) = e^{ax}$. If $a = 0$, (5.9.10) is clearly true.
If $a \ne 0$, then by (5.9.8), (5.9.10) is equivalent to

$$e^{a(x+1)} = \int_x^{x+1} \tau(a) e^{at}\, dt = \tau(a) \cdot \frac{e^{a(x+1)} - e^{ax}}{a},$$

which is true since $\tau(a) = a/(1 - e^{-a})$. Thus,

$$e^{a(x+1)} = \int_x^{x+1} \tau(a) e^{at}\, dt, \qquad a \in \mathbf{R},\ x \in \mathbf{R}.$$


Now apply $\partial^n/\partial a^n$ to both sides of this last equation, differentiate under the
integral sign, and use (5.9.9). You obtain (5.9.10) with $f(x) = e^{ax} x^n$. By
linearity, one obtains (5.9.10) for any elementary function $f$.

Given $a < b$ with $a, b \in \mathbf{Z}$, insert $x = a, a+1, a+2, \ldots, b-1$ in (5.9.10),
and sum the resulting equations to get the following.


Theorem 5.9.3 (Euler-Maclaurin). For $a < b$ in $\mathbf{Z}$ and any elementary
function $f$,

$$\sum_{a < n \le b} f(n) = \int_a^b \tau(D) f(t)\, dt. \tag{5.9.11}$$


The derivation of this is a triviality. The depth lies in the usefulness of
the result. This arises from the fact that (5.9.11) equates a discrete sum of $f$
on the left with a continuous sum of a related function $\tau(D)f$ on the right.
Indeed, the tension between the discrete and the continuous is at the basis
of many important mathematical phenomena.¹⁹
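For polynomial f the Euler-Maclaurin identity, which equates the sum of f(n) over a < n <= b with the integral of the transformed polynomial over (a, b), can be confirmed with exact rational arithmetic. The sketch below assumes the sign convention B_1 = +1/2 used in this section and generates the Bernoulli numbers from the recurrence B_m = 1 - sum over k < m of C(m,k) B_k / (m - k + 1), which is a standard recurrence for that convention. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_plus(m_max):
    # Bernoulli numbers with B_1 = +1/2 (coefficients of x / (1 - e^{-x})).
    B = []
    for m in range(m_max + 1):
        s = sum(Fraction(comb(m, k)) * B[k] / (m - k + 1) for k in range(m))
        B.append(Fraction(1) - s)
    return B


def poly_derivative(c):
    return [k * c[k] for k in range(1, len(c))]


def tau_D(c):
    # Apply sum_k B_k D^k / k! to the polynomial with coefficients c
    # (c[k] multiplies x**k); the sum is finite for polynomials.
    B = bernoulli_plus(len(c))
    result = [Fraction(0)] * len(c)
    deriv = [Fraction(v) for v in c]
    for k in range(len(c)):
        for i, v in enumerate(deriv):
            result[i] += B[k] * v / factorial(k)
        deriv = poly_derivative(deriv)
    return result


def integrate(c, a, b):
    return sum(ck * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1)) / (k + 1)
               for k, ck in enumerate(c))


def euler_maclaurin_check(c, a, b):
    # Left side: sum of f(n) over a < n <= b. Right side: integral of tau(D) f.
    lhs = sum(sum(ck * Fraction(n) ** k for k, ck in enumerate(c))
              for n in range(a + 1, b + 1))
    rhs = integrate(tau_D(c), a, b)
    return lhs, rhs


if __name__ == "__main__":
    print(euler_maclaurin_check([0, 0, 0, 1], 0, 4))  # f(x) = x^3
```

With f(x) = x^3 the transformed polynomial is x^3 + (3/2)x^2 + (1/2)x, whose integral over (0, b) reproduces the familiar closed form for the sum of cubes.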

By inserting $a = 0$, $b = \infty$, and $f(t) = 1/(x+t)^2$, $x$ fixed, in (5.9.11), one
can derive a sharpening of Stirling's approximation (§5.5), the Stirling series
for $\log \Gamma(x)$. Since this $f$ is not elementary, here one obtains a divergent
series $\tau(D)f$. Instead of starting with (5.9.11), however, it will be quicker for
us to derive the Stirling series from the identity (Exercise 5.6.14)

$$\frac{d^2}{dx^2} \log \Gamma(x) = \int_0^\infty e^{-xt} \tau(t)\, dt, \qquad x > 0. \tag{5.9.12}$$

But, first, we discuss asymptotic expansions.

Let $f$ and $g$ be defined near $x = c$. We say that $f$ is big oh of $g$, as
$x \to c$, and we write $f(x) = O(g(x))$, as $x \to c$, if the ratio $f(x)/g(x)$ is
bounded for $x \ne c$ in some interval about $c$. If $c = \infty$, then we require that
$f(x)/g(x)$ be bounded for $x$ sufficiently large. For example, $f(x) \sim g(x)$,
as $x \to c$, implies $f(x) = O(g(x))$ and $g(x) = O(f(x))$, as $x \to c$. We
write $f(x) = g(x) + O(h(x))$ to mean $f(x) - g(x) = O(h(x))$. Note that
$f(x) = O(h(x))$ and $g(x) = O(h(x))$ imply $f(x) + g(x) = O(h(x))$ or, what
is the same, $O(h(x)) + O(h(x)) = O(h(x))$.

We say that

$$f(x) \approx a_0 + a_1 x + a_2 x^2 + \cdots \tag{5.9.13}$$

is an asymptotic expansion of $f$ at zero if

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n + O(x^{n+1}), \qquad x \to 0, \tag{5.9.14}$$

for all $n \ge 0$. Here there is no assumption regarding the convergence of
the series in (5.9.13). Although the Taylor series of a smooth function may
diverge, we have the following.

¹⁹ Is light composed of particles or waves?


Theorem 5.9.4. If $f$ is smooth in an interval about $0$, then

$$f(x) \approx f(0) + f'(0)x + \frac{1}{2!} f''(0)x^2 + \frac{1}{3!} f'''(0)x^3 + \frac{1}{4!} f^{(4)}(0)x^4 + \cdots$$

is an asymptotic expansion at zero.


This follows from Taylor's theorem. If $a_n = f^{(n)}(0)/n!$, then from §3.5,

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \frac{1}{(n+1)!} h_{n+1}(x)\, x^{n+1},$$

with $h_{n+1}$ continuous on an interval about $0$, hence bounded near $0$.
For example,

$$e^{-1/|x|} \approx 0, \qquad x \to 0,$$

since $t = 1/x$ implies $e^{-1/|x|}/x^n = e^{-|t|} t^n \to 0$ as $t \to \pm\infty$.
Actually, we will need asymptotic expansions at $\infty$. Let $f$ be defined near
$\infty$, i.e., for $x$ sufficiently large. We say that

$$f(x) \approx a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \cdots \tag{5.9.15}$$

is an asymptotic expansion of $f$ at infinity if

$$f(x) = a_0 + \frac{a_1}{x} + \cdots + \frac{a_n}{x^n} + O\left(\frac{1}{x^{n+1}}\right), \qquad x \to \infty,$$

for all $n \ge 0$. For example, $e^{-x} \approx 0$ as $x \to \infty$, since $e^{-x} = O(x^{-n})$, as
$x \to \infty$, for all $n \ge 0$. Here is the Stirling series.


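Theorem 5.9.5 (Stirling). As $x \to \infty$,

$$\log \Gamma(x) - \left[\left(x - \frac{1}{2}\right)\log x - x\right] \approx \frac{1}{2}\log(2\pi) + \frac{B_2}{2x} + \frac{B_4}{4 \cdot 3\, x^3} + \frac{B_6}{6 \cdot 5\, x^5} + \cdots. \tag{5.9.16}$$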
Note that, ignoring the terms with Bernoulli numbers, this result reduces
to Stirling's approximation (§5.5). Moreover, note that, because this is an
expression for $\log \Gamma(x)$ and not $\Gamma(x)$, the terms involving the Bernoulli numbers
are measures of relative error. Thus, the principal error term $B_2/2x = 1/12x$
equals $1/1200$ for $x = 100$, which agrees with the relative error of .08% found
in Exercise 5.5.2.
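The size of the correction terms in the Stirling series can be seen numerically. The sketch below compares a truncated series with the standard library's `math.lgamma`; the general term B_n / (n (n - 1) x^(n-1)) for even n follows the pattern of the displayed terms, the hard-coded values of B_2 through B_8 are the standard ones, and the function name `stirling_log_gamma` is hypothetical.

```python
import math

# Standard values of the even-index Bernoulli numbers B_2, B_4, B_6, B_8.
B_EVEN = [1 / 6, -1 / 30, 1 / 42, -1 / 30]


def stirling_log_gamma(x, terms):
    # Leading part: (x - 1/2) log x - x + (1/2) log(2 pi).
    base = (x - 0.5) * math.log(x) - x + 0.5 * math.log(2 * math.pi)
    # Correction terms B_n / (n (n - 1) x^(n - 1)) for n = 2, 4, 6, ...
    corr = 0.0
    for i in range(terms):
        n = 2 * (i + 1)
        corr += B_EVEN[i] / (n * (n - 1) * x ** (n - 1))
    return base + corr


if __name__ == "__main__":
    x = 100.0
    print(math.lgamma(x) - stirling_log_gamma(x, 0), 1 / 1200)
    print(math.lgamma(10.0) - stirling_log_gamma(10.0, 3))
```

Because the series is only asymptotic, adding more terms at a fixed x eventually stops helping; for a given x the useful number of terms is limited.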
To derive (5.9.16), we will use (5.9.12) and replace $\tau(t)$ by its Bernoulli
series to obtain

$$\frac{d^2}{dx^2} \log \Gamma(x) \approx \frac{1}{x} + \frac{1}{2x^2} + \frac{B_2}{x^3} + \frac{B_4}{x^5} + \cdots, \qquad x \to \infty. \tag{5.9.17}$$

Then we integrate this twice to get (5.9.16).
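First, we show that the portion of the integral in (5.9.12) over $(1,\infty)$ has no
effect on the asymptotic expansion (5.9.17). Fix $n \ge 1$. To this end, note that
$\tau(t) = t/(1 - e^{-t}) \le t/(1 - e^{-1})$ for $t \ge 1$, so

$$0 \le \int_1^\infty e^{-xt} \tau(t)\, dt \le \frac{1}{1 - e^{-1}} \int_1^\infty e^{-xt}\, t\, dt = \frac{1}{1 - e^{-1}} \left(\frac{1}{x} + \frac{1}{x^2}\right) e^{-x}$$

for $x > 0$. Thus, for all $n \ge 1$,

$$\int_1^\infty e^{-xt} \tau(t)\, dt = O\left(\frac{1}{x^{n+1}}\right), \qquad x \to \infty.$$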

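Since $\tau$ is smooth at zero and the Bernoulli series is the Taylor series of $\tau$,
by Taylor's theorem (§3.5), there is a continuous $h_n : \mathbf{R} \to \mathbf{R}$ satisfying

$$\tau(t) = B_0 + \frac{B_1}{1!}\, t + \frac{B_2}{2!}\, t^2 + \cdots + \frac{B_{n-1}}{(n-1)!}\, t^{n-1} + \frac{h_n(t)}{n!}\, t^n, \qquad t \in \mathbf{R}.$$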


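Then (Exercise 5.9.4),

$$\int_0^1 e^{-xt} h_n(t) \frac{t^n}{n!}\, dt = \frac{1}{x^{n+1}} \int_0^x e^{-t} h_n(t/x) \frac{t^n}{n!}\, dt = O\left(\frac{1}{x^{n+1}}\right), \qquad x \to \infty,$$

since $h_n(x)$ is bounded for $0 \le x \le 1$. Similarly (Exercise 5.9.5),

$$\int_1^\infty e^{-xt} \frac{t^k}{k!}\, dt = O\left(\frac{1}{x^{n+1}}\right), \qquad x \to \infty,\ k \ge 0.$$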


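Now insert all this into (5.9.12), and use

$$\int_0^\infty e^{-xt} \frac{t^k}{k!}\, dt = \frac{1}{x^{k+1}}, \qquad x > 0,\ k \ge 0,$$

to get, for fixed $n \ge 1$,

$$\begin{aligned}
\int_0^\infty e^{-xt}\tau(t)\, dt &= \int_0^1 e^{-xt}\tau(t)\, dt + \int_1^\infty e^{-xt}\tau(t)\, dt \\
&= \sum_{k=0}^{n-1} B_k \int_0^1 e^{-xt}\frac{t^k}{k!}\, dt + \int_0^1 e^{-xt} h_n(t)\frac{t^n}{n!}\, dt + O\left(\frac{1}{x^{n+1}}\right) \\
&= \sum_{k=0}^{n-1} B_k \int_0^\infty e^{-xt}\frac{t^k}{k!}\, dt + O\left(\frac{1}{x^{n+1}}\right) \\
&= \sum_{k=0}^{n-1} \frac{B_k}{x^{k+1}} + O\left(\frac{1}{x^{n+1}}\right).
\end{aligned}$$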


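Since $B_0 = 1$, $B_1 = 1/2$, and this is true for all $n \ge 1$, this derives (5.9.17).
To get (5.9.16), let $f(x) = (\log \Gamma(x))'' - (1/x) - (1/2x^2)$. Then by (5.9.17),

$$f(x) = \frac{B_2}{x^3} + \frac{B_4}{x^5} + \cdots + \frac{B_n}{x^{n+1}} + O\left(\frac{1}{x^{n+2}}\right). \tag{5.9.18}$$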


Since the right side of this last equation is integrable over $(x,\infty)$ for any
$x > 0$, so is $f$. Since $-\int_x^\infty f(t)\, dt$ is a primitive of $f$, we obtain

$$\int_x^\infty f(t)\, dt = -[\log\Gamma(x)]' + \log x - \frac{1}{2x} - A$$

for some constant $A$. So integrating both sides of (5.9.18) over $(x,\infty)$ leads to

$$-[\log\Gamma(x)]' + \log x - \frac{1}{2x} - A = \frac{B_2}{2x^2} + \frac{B_4}{4x^4} + \cdots + \frac{B_n}{n x^n} + O\left(\frac{1}{x^{n+1}}\right).$$

Similarly, integrating this last equation over $(x,\infty)$ leads to

$$\log\Gamma(x) - x\log x + x + \frac{1}{2}\log x + Ax - B = \frac{B_2}{2 \cdot 1\, x} + \frac{B_4}{4 \cdot 3\, x^3} + \cdots + \frac{B_n}{n(n-1)\, x^{n-1}} + O\left(\frac{1}{x^n}\right).$$

Noting that the right side of this last equation vanishes as $x \to \infty$, inserting
$x = n$ in the left side, and comparing with Stirling's approximation in §5.5,
we conclude that $A = 0$ and $B = \log(2\pi)/2$, thus obtaining (5.9.16).


Exercises 


5.9.1. Show that $g$, as defined by (5.9.4), is well defined, smooth on $\mathbf{R}$, and
satisfies (5.9.3), when $f$ vanishes on $(-\infty, -1)$.


5.9.2. Find a smooth $g$ solving (5.9.3), when $f$ vanishes on $(1,\infty)$.


5.9.3. Let $c$ be a power series with radius of convergence $R$. Show that
$c(D)(e^{ax} x^n)$ is well defined for $|a| < R$ and $n \ge 0$ and satisfies (5.9.9).


5.9.4. Show that $\int_0^1 e^{-xt} f(t)\, t^n\, dt = O\left(\frac{1}{x^{n+1}}\right)$ for any continuous bounded
$f : (0,1) \to \mathbf{R}$ and $n \ge 0$.


5.9.5. For all $n \ge 0$ and $p > 0$, show that $\int_1^\infty e^{-xt}\, t^p\, dt \approx 0$, as $x \to \infty$.


5.9.6. Show that the Stirling series in (5.9.16) cannot converge anywhere.
(If it did converge at $a \ne 0$, then the Bernoulli series would converge on all
of $\mathbf{R}$.)


Chapter 6 
Generalizations 


The theory of the integral developed in Chapter 4 was just enough to develop 
all results in Chapter 5. Nevertheless, the treatment of the fundamental the- 
orems of calculus depended strongly on the continuity of the integrand. It 
is natural to ask if this assumption can be relaxed and to seek the natural 
setting of Theorem 4.4.2 and Theorem 4.4.3. 

It turns out that continuity can indeed be dropped and the theory in 
Chapter 4 can be pushed to provide a complete resolution. Starting one hundred
years ago, Lebesgue, in his doctoral thesis Intégrale, Longueur, Aire
[11], isolated the notion of measurability of sets and functions and used
it to establish his generalizations of the fundamental theorems. Indeed, the 
method of exhaustion (§4.5) is a consequence of these ideas, even though it 
is presented earlier in this text out of necessity. The original treatment of 
these results was simplified and sharpened by Banach, Caratheodory, Riesz, 
Vitali, and others. These results, while providing a complete treatment of the 
fundamental theorems in one dimension, also turned out to be the spark that 
led to investigations of multidimensional generalizations. These investigations 
continue to the present day. 

Throughout this chapter, there is no presumption of measurability unless 
explicitly stated. In particular, integrability as defined in §4.3 does not
presume measurability, and we use arbitrary in this chapter to mean not 
necessarily measurable. 


6.1 Measurable Functions and Linearity 


In §4.4, we derived linearity (Theorem 4.4.5)

$$\int_a^b [f(x) + g(x)]\, dx = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx \tag{6.1.1}$$

when $f$ and $g$ are both integrable or both nonnegative, as long as both $f$
and $g$ are continuous. In this section, we establish (6.1.1) when continuity is
replaced by measurability.

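In §4.5, we defined measurable subsets $M \subset \mathbf{R}^2$. Now we define $M \subset \mathbf{R}$ to
be measurable if $M \times (0,1) \subset \mathbf{R}^2$ is measurable. Since measurability in $\mathbf{R}^2$ is
dilation invariant (Exercise 6.1.2), this happens iff $M \times (0,m)$ is measurable
for $m$ positive.

Since the intersection of a sequence of measurable sets in $\mathbf{R}^2$ is measurable
in $\mathbf{R}^2$ (§4.5) and

$$\left(\bigcap_{n=1}^\infty M_n\right) \times (0,1) = \bigcap_{n=1}^\infty \left(M_n \times (0,1)\right),$$

the same is true of measurable sets in $\mathbf{R}$. Since the union of a sequence of
measurable sets in $\mathbf{R}^2$ is measurable in $\mathbf{R}^2$ (§4.5) and

$$\left(\bigcup_{n=1}^\infty M_n\right) \times (0,1) = \bigcup_{n=1}^\infty \left(M_n \times (0,1)\right),$$

the same is true of measurable sets in $\mathbf{R}$. Since

$$(\mathbf{R} \setminus M) \times (0,1) = (\mathbf{R} \times (0,1)) \setminus (M \times (0,1)),$$

$M \subset \mathbf{R}$ measurable implies $M^c \subset \mathbf{R}$ measurable.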
We say $f$ on $(a,b)$ is measurable if $\{x : f(x) < m\} \subset (a,b)$ is measurable
for all $m$ real. Since

$$\{x : f(x) \le m\} = \bigcap_{n=1}^\infty \{x : f(x) < m + 1/n\}$$

and

$$\{x : f(x) < m\} = \bigcup_{n=1}^\infty \{x : f(x) \le m - 1/n\},$$

$f$ is measurable iff $\{x : f(x) \le m\} \subset (a,b)$ is measurable for all $m$ real.


Theorem 6.1.1. If $f$ is continuous on $(a,b)$, then $f$ is measurable.


The continuity of $f$ implies $\{x : f(x) < m\} \times (0,1)$ is open in $\mathbf{R}^2$, hence,
measurable in $\mathbf{R}^2$ (§4.5); hence, $\{x : f(x) < m\}$ is measurable in $\mathbf{R}$. Since
this is so for all $m$ real, $f$ is measurable.


Theorem 6.1.2. If $f$ and $g$ are measurable on $(a,b)$, so are $-f$, $f + g$,
and $fg$.


These are Exercises 6.1.4, 6.1.5, and 6.1.6. 
The next result indicates the breadth of the class of measurable functions.
We say $f_n$, $n \ge 1$, converges to $f$ pointwise on $(a,b)$ if $f_n(x) \to f(x)$ as
$n \to \infty$ for all $a < x < b$.


Theorem 6.1.3. If $f_n$, $n \ge 1$, are measurable and converge pointwise on
$(a,b)$ to $f$, then $f$ is measurable on $(a,b)$.


These results show that the class of measurable functions includes any
function constructed from continuous functions using algebraic or limiting
processes, in particular every function in this text. Because of this breadth,
it is natural to ask whether non-measurable functions exist at all. In other
words, is every subset of $\mathbf{R}$ measurable? The answer depends on the exact
formulation of the axioms¹ of set theory in §1.1.

Let

$$f_{n*}(x) = \inf_{k \ge n} f_k(x), \qquad n \ge 1,$$

be the lower sequence. Then

$$\{x : f_{n*}(x) < m\} = \bigcup_{k \ge n} \{x : f_k(x) < m\};$$

hence, $f_{n*}$ is measurable for $n \ge 1$. Now $f_{n*}(x) \nearrow f(x)$ so

$$f(x) = \sup_{n \ge 1} f_{n*}(x).$$

Then $f(x) < m$ iff $f(x) \le m - 1/k$ for some $k \ge 1$ which happens iff
$f_{n*}(x) \le m - 1/k$ for all $n \ge 1$. Thus

$$\{x : f(x) < m\} = \bigcup_{k=1}^\infty \bigcap_{n=1}^\infty \{x : f_{n*}(x) \le m - 1/k\},$$

hence, $f$ is measurable.
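Given a set $A \subset \mathbf{R}$, the indicator function corresponding to $A$ is the
function $1_A : \mathbf{R} \to \mathbf{R}$ equal to $1$ on $A$ and $0$ on $A^c$,

$$1_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}$$

Then $M$ is measurable iff $1_M$ is measurable. Note $1_\emptyset = 0$.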
A function $f$ on $(a,b)$ is simple if its range $f((a,b))$ is a finite set. Then $f$
is simple iff

$$f(x) = a_1 1_{A_1}(x) + \cdots + a_N 1_{A_N}(x), \qquad a < x < b, \tag{6.1.2}$$

for some $a_1, a_2, \ldots, a_N$ real and subsets $A_1, A_2, \ldots, A_N$ of $(a,b)$
(Exercise 6.1.1). Moreover, $f$ is measurable iff the sets $A_1, \ldots, A_N$ can be chosen
measurable. The sets $A_1, A_2, \ldots, A_N$ need not be nonempty, and the reals
$a_1, a_2, \ldots, a_N$ need not be nonzero. We say (6.1.2) is in canonical form if
$A_1, A_2, \ldots, A_N$ are disjoint with union $(a,b)$.

¹ Including the axiom of choice.

Theorem 6.1.4. If $f$ is nonnegative on $(a,b)$, there is an increasing sequence
$0 \le f_1 \le f_2 \le \cdots \le f$ of simple functions converging pointwise to $f$. If $f$ is
measurable, then $f_n$, $n \ge 1$, can be chosen measurable.


For $n \ge 1$, let

$$A_{j,n} = \{x : (j-1)2^{-n} \le f(x) < j\, 2^{-n}\}, \qquad j = 1, 2, \ldots, n2^n,$$

and set²

$$f_n = \sum_{j=1}^{n2^n} (j-1)\, 2^{-n}\, 1_{A_{j,n}}.$$

Since

$$\{x : m_1 \le f(x) < m_2\} = \{x : f(x) < m_2\} \cap \{x : f(x) < m_1\}^c,$$

$A_{j,n}$ is measurable if $f$ is so, $f_n$ is simple, and $f_n \le f \le f_n + 2^{-n}$ wherever $f < n$; thus,
$f_n \to f$ pointwise as $n \to \infty$. Now fix $n \ge 1$. We show $f_n \le f_{n+1}$ on $A_{j,n}$,
for $j = 1, \ldots, n2^n$. Since

$$A_{j,n} = A_{2j-1,n+1} \cup A_{2j,n+1},$$

$x \in A_{j,n}$ iff $x \in A_{2j-1,n+1}$ or $x \in A_{2j,n+1}$. In the former case,
$f_{n+1}(x) = f_n(x)$, while, in the latter case, $f_{n+1}(x) = f_n(x) + 2^{-(n+1)}$. Thus
$(f_1, f_2, \ldots)$ is increasing.
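The dyadic construction is easy to compute pointwise: where f(x) < n the value of the n-th simple function is f(x) rounded down to a multiple of 2^(-n), and elsewhere it is 0 because no level set of index at most n 2^n contains x. The sketch below evaluates this at a single point; the function name `dyadic_approx` and the sample value 1.69 are illustrative choices, not taken from the text.

```python
import math


def dyadic_approx(f_value, n):
    # Value of the n-th simple function at a point where f equals f_value >= 0.
    if f_value >= n:
        # The point lies in none of the sets A_{j,n}, j = 1, ..., n * 2**n.
        return 0.0
    # Otherwise the point lies in A_{j,n} with j - 1 = floor(f_value * 2**n).
    return math.floor(f_value * 2**n) / 2**n


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, dyadic_approx(1.69, n))
```

Scaling by a power of two is exact in binary floating point, so the rounding here behaves as the construction describes.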

Now we derive (6.1.1) for f,g nonnegative, simple, and measurable. 

We begin with a single nonnegative, simple $f$ with (6.1.2) in canonical
form and $A_1, \ldots, A_N$ measurable. Let $G$ be the subgraph of $f$ and $G_j$ the
subgraph of $a_j 1_{A_j}$, $j = 1, \ldots, N$. Then the sets $G_j$ are measurable in $\mathbf{R}^2$,
disjoint, and $G_1 \cup \cdots \cup G_N = G$; hence, by additivity of area (Exercise 4.5.13)
and (vertical) dilation invariance,

$$\begin{aligned}
\int_a^b f(x)\, dx &= \operatorname{area}(G) \\
&= \operatorname{area}(G_1) + \cdots + \operatorname{area}(G_N) \\
&= \int_a^b a_1 1_{A_1}(x)\, dx + \cdots + \int_a^b a_N 1_{A_N}(x)\, dx \\
&= a_1 \int_a^b 1_{A_1}(x)\, dx + \cdots + a_N \int_a^b 1_{A_N}(x)\, dx.
\end{aligned}$$

² This is a dyadic decomposition of the range of $f$.
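Now let $f, g$ be nonnegative and simple with

$$f = \sum_i a_i 1_{A_i}, \qquad g = \sum_j b_j 1_{B_j},$$

both in canonical form with $(A_i)$, $(B_j)$ measurable. Then

$$f = \sum_{i,j} a_i 1_{A_i \cap B_j}, \qquad g = \sum_{i,j} b_j 1_{A_i \cap B_j},$$

are in canonical form, with $(A_i \cap B_j)$ measurable, and so is

$$f + g = \sum_{i,j} (a_i + b_j) 1_{A_i \cap B_j}.$$

By what we just derived (applied three times),

$$\begin{aligned}
\int_a^b [f(x) + g(x)]\, dx &= \sum_{i,j} (a_i + b_j) \int_a^b 1_{A_i \cap B_j}(x)\, dx \\
&= \sum_{i,j} a_i \int_a^b 1_{A_i \cap B_j}(x)\, dx + \sum_{i,j} b_j \int_a^b 1_{A_i \cap B_j}(x)\, dx \\
&= \int_a^b f(x)\, dx + \int_a^b g(x)\, dx.
\end{aligned}$$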


This establishes (6.1.1) for f,g nonnegative, simple, and measurable. 

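If $f, g$ are nonnegative and measurable, select sequences $(f_n)$, $(g_n)$ of
nonnegative, simple, measurable functions with $f_n \nearrow f$ and $g_n \nearrow g$ pointwise as
$n \to \infty$. By the monotone convergence theorem (§4.5) applied three times,

$$\begin{aligned}
\int_a^b [f(x) + g(x)]\, dx &= \lim_{n \to \infty} \int_a^b [f_n(x) + g_n(x)]\, dx \\
&= \lim_{n \to \infty} \int_a^b f_n(x)\, dx + \lim_{n \to \infty} \int_a^b g_n(x)\, dx \\
&= \int_a^b f(x)\, dx + \int_a^b g(x)\, dx.
\end{aligned}$$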

This establishes (6.1.1) for f,g nonnegative and measurable. 


Theorem 6.1.5. Let f and g be measurable. If f and g are both nonnegative 
or both integrable on (a,b), then (6.1.1) holds. 


Above we derived the nonnegative case. Let $f, g$ be nonnegative, measurable,
and integrable, and let $A = \{x : f(x) > g(x)\} \subset (a,b)$, $f_1 = f 1_A$,
$f_2 = f 1_{A^c}$, $g_1 = g 1_A$, $g_2 = g 1_{A^c}$. Then $A$ is measurable (Exercise 6.1.3),
$f_1 + f_2 = f$, $g_1 + g_2 = g$, $f_1 \ge g_1$, $g_2 \ge f_2$, and $(f - g)^+ = f_1 - g_1$,
$(f - g)^- = g_2 - f_2$. By the nonnegative case applied four times,

$$\begin{aligned}
\int_a^b [f(x) - g(x)]\, dx &= \int_a^b [f(x) - g(x)]^+\, dx - \int_a^b [f(x) - g(x)]^-\, dx \\
&= \int_a^b [f_1(x) - g_1(x)]\, dx - \int_a^b [g_2(x) - f_2(x)]\, dx \\
&= \left(\int_a^b f_1\, dx - \int_a^b g_1\, dx\right) - \left(\int_a^b g_2\, dx - \int_a^b f_2\, dx\right) \\
&= \int_a^b f(x)\, dx - \int_a^b g(x)\, dx.
\end{aligned}$$

This establishes (6.1.1) with $f, g$ measurable and integrable and $f \ge 0$ and
$g \le 0$.
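Finally, let $f, g$ be measurable and integrable. By what we just learned,
applied to the nonnegative functions $f^+ + g^+$ and $f^- + g^-$,

$$\int_a^b [f(x) + g(x)]\, dx = \int_a^b [f^+(x) + g^+(x)]\, dx - \int_a^b [f^-(x) + g^-(x)]\, dx = \int_a^b f(x)\, dx + \int_a^b g(x)\, dx.$$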

This establishes (6.1.1) when f, g are measurable and integrable. 
Since sums of measurable functions are measurable, by induction one has
linearity for finite sums

$$\int_a^b \sum_{j=1}^N f_j(x)\, dx = \sum_{j=1}^N \int_a^b f_j(x)\, dx$$

when the measurable functions $f_1, \ldots, f_N$ are all integrable or all
nonnegative.
Exercises 


6.1.1. Show that $f$ is simple iff (6.1.2) holds. Moreover, (6.1.2) can be taken
in canonical form.


6.1.2. Let $\lambda > 0$, and suppose a bijection $f : \mathbf{R}^2 \to \mathbf{R}^2$ satisfies
$\operatorname{area}(f(A)) = \lambda \operatorname{area}(A)$ for all $A \subset \mathbf{R}^2$. Show that $M \subset \mathbf{R}^2$ measurable implies $f(M)$ is
measurable. In particular, this is so for any dilate or translate of $M$.


6.1.3. If $f, g$ are measurable on $(a,b)$, then $A = \{x : f(x) > g(x)\}$ is measurable.
(What is the connection between $A$ and $A_r = \{x : f(x) > r > g(x)\}$,
$r \in \mathbf{Q}$?)


6.1.4. If $f$ is measurable, so is $-f$.


6.1.5. If $f, g$ are measurable, then $f + g$ is measurable. (What is the connection
between $A = \{x : f(x) + g(x) < M\}$ and $A_r = \{x : f(x) < r\}$, $r \in \mathbf{Q}$,
and $B_s = \{x : g(x) < s\}$, $s \in \mathbf{Q}$?)


6.1.6. If $f, g$ are measurable, then $fg$ is measurable. Start with $f, g$ nonnegative.
(What is the connection between $A = \{x : f(x)g(x) < M\}$ and
$A_r = \{x : f(x) < r\}$, $r \in \mathbf{Q}$, and $B_s = \{x : g(x) < s\}$, $s \in \mathbf{Q}$?)


6.2 Limit Theorems 


In this section we present the limit theorems of Chapter 5 in the broader
measurable setting. The proofs are exactly as before, the only change being the
use of linearity for measurable integrands (Theorem 6.1.5) instead of linearity
for continuous integrands (Theorem 4.4.5).

There is no need to restate the monotone convergence theorem (Theo- 
rem 5.1.2) or Fatou’s lemma (Exercise 5.1.2) as they are valid for arbitrary 
functions. 


Theorem 6.2.1 (Summation Under the Integral Sign: Positive
Case). Let $f_n$, $n \ge 1$, be a sequence of nonnegative functions on $(a,b)$.
If $f_n$, $n \ge 1$, are measurable, then

$$\int_a^b \left[\sum_{n=1}^\infty f_n(x)\right] dx = \sum_{n=1}^\infty \int_a^b f_n(x)\, dx.$$


Next is the dominated convergence theorem. 


Theorem 6.2.2 (Dominated Convergence Theorem). Let $f_n$, $n \ge 1$,
be a sequence of functions on $(a,b)$. Suppose there is a function $g$ integrable
on $(a,b)$ satisfying $|f_n(x)| \le g(x)$ for all $x$ in $(a,b)$ and all $n \ge 1$. If

$$\lim_{n \to \infty} f_n(x) = f(x), \qquad a < x < b,$$

then $f$ and $f_n$, $n \ge 1$, are integrable on $(a,b)$. If $g$ and $f_n$, $n \ge 1$, are
measurable, then

$$\lim_{n \to \infty} \int_a^b f_n(x)\, dx = \int_a^b \lim_{n \to \infty} f_n(x)\, dx = \int_a^b f(x)\, dx. \tag{6.2.1}$$


Following are the consequences of the dominated convergence theorem. 




Theorem 6.2.3 (Summation Under the Integral Sign: Alternating
Case). Let $f_n$, $n \ge 1$, be a decreasing sequence of nonnegative functions
on $(a,b)$, and suppose that $f_1$ is integrable on $(a,b)$. Then $f_n$, $n \ge 1$, and
$\sum_{n=1}^\infty (-1)^{n-1} f_n$ are integrable on $(a,b)$. If $f_n$, $n \ge 1$, are measurable, then

$$\int_a^b \left[\sum_{n=1}^\infty (-1)^{n-1} f_n(x)\right] dx = \sum_{n=1}^\infty (-1)^{n-1} \int_a^b f_n(x)\, dx.$$


Theorem 6.2.4 (Summation Under the Integral Sign: Absolute
Case). Let $f_n$, $n \ge 1$, be a sequence of functions on $(a,b)$, and suppose that
there is a function $g$ integrable on $(a,b)$ and satisfying $\sum_{n=1}^\infty |f_n(x)| \le g(x)$
for all $x$ in $(a,b)$. Then $f_n$, $n \ge 1$, and $\sum_{n=1}^\infty f_n$ are integrable on $(a,b)$. If $g$
and $f_n$, $n \ge 1$, are measurable, then

$$\int_a^b \left[\sum_{n=1}^\infty f_n(x)\right] dx = \sum_{n=1}^\infty \int_a^b f_n(x)\, dx.$$


Theorem 6.2.5 (Continuity Under the Integral Sign). Let $f(x,t)$ be a
function of two variables, defined on $(a,b) \times (c,d)$, such that $f(\cdot,t)$ is
continuous for all $c < t < d$. Suppose there is a function $g$ integrable on $(c,d)$
satisfying $|f(x,t)| \le g(t)$ for $a < x < b$ and $c < t < d$. Then $f(x,\cdot)$ is
integrable on $(c,d)$ for $a < x < b$ and

$$F(x) = \int_c^d f(x,t)\, dt, \qquad a < x < b,$$

is well defined. If $g$ and $f(x,\cdot)$, $a < x < b$, are measurable, then $F$ is
continuous.


Last is differentiation under the integral sign. 


Theorem 6.2.6 (Differentiation Under the Integral Sign). Let $f(x,t)$
be a function of two variables, defined on $(a,b) \times (c,d)$, such that

$$\frac{\partial f}{\partial x}(x,t), \qquad a < x < b,\ c < t < d,$$

exists. Suppose there is a function $g$ integrable on $(c,d)$, such that

$$|f(x,t)| + \left|\frac{\partial f}{\partial x}(x,t)\right| \le g(t), \qquad a < x < b,\ c < t < d.$$

Then $f(x,\cdot)$ and $\partial f/\partial x(x,\cdot)$ are integrable for all $a < x < b$, and

$$F(x) = \int_c^d f(x,t)\, dt, \qquad a < x < b,$$

is well defined. If $g$ and $f(x,\cdot)$, $a < x < b$, are measurable, then $F$ is
differentiable on $(a,b)$ and

$$F'(x) = \int_c^d \frac{\partial f}{\partial x}(x,t)\, dt, \qquad a < x < b.$$


A set $U \subset \mathbf{R}$ is open if for every $c \in U$, there is an open interval $I$
containing $c$ and contained in $U$. Clearly, an open interval is an open set.
Conversely, by Exercise 6.2.1, every open set $U \subset \mathbf{R}$ is a finite or countable
disjoint union of open intervals. These intervals are the component intervals
of $U$.


Exercises 


6.2.1. Show every open set $U \subset \mathbf{R}$ is a finite or countable disjoint union of
open intervals. (Given $x \in U$, look at the largest open interval $(a_x, b_x) \subset U$
containing $x$.)


6.2.2. $F$ is continuous on $(a,b)$ iff for every open set $U \subset \mathbf{R}$, $F^{-1}(U)$ is an
open set.


6.2.3. If $F$ is continuous on $[a,b]$ and $U \subset (a,b)$ is an open set, then $F(U)$ is
measurable.


6.2.4. If $U$ is a countable disjoint union of open intervals $I_k = (c_k, d_k)$, $k \ge 1$,
and $f$ is arbitrary nonnegative on $\mathbf{R}$, then (4.3.5)

$$\int_{-\infty}^\infty 1_U(x) f(x)\, dx = \sum_{k=1}^\infty \int_{c_k}^{d_k} f(x)\, dx. \tag{6.2.2}$$


6.2.5. Let $f_n$, $n \ge 1$, be measurable on $(a,b)$, and let

$$A = \{x : \lim_{n \to \infty} f_n(x) \text{ exists}\}.$$

Show $A$ is measurable. If $f$ equals the limit on $A$ and zero off $A$, then $f$ is
measurable.


6.3 The Fundamental Theorems of Calculus 


Recall (§3.7) $F$ is a primitive of $f$ on $(a,b)$ if $F'(x) = f(x)$ for all $x$ in $(a,b)$.
We previously established the following:
• The first fundamental theorem of calculus.
• Any two primitives of a continuous function differ by a constant.
• The second fundamental theorem of calculus.

In this section, we give an overview of the generalizations of these results.
The proofs will occupy the rest of the chapter.

We begin with a simple example to illustrate what can happen when $f$ is
not continuous. Let $f = 1_{(0,1)}$, and let

$$F(x) = \int_{-\infty}^x f(t)\, dt = \begin{cases} 0, & x \le 0, \\ x, & 0 \le x \le 1, \\ 1, & x \ge 1. \end{cases}$$

Thus $F'(x)$ exists and equals $f(x)$ for all $x$ real except $x = 0$ and $x = 1$.
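The failure of differentiability at the two jump points of the indicator of (0,1) shows up directly in one-sided difference quotients. The sketch below uses the closed form of the integral, namely x clamped to the interval [0, 1]; the names `F` and `difference_quotient` and the step size 1e-6 are illustrative.

```python
def F(x):
    # Integral of the indicator of (0, 1) from -infinity to x:
    # 0 for x <= 0, x for 0 <= x <= 1, and 1 for x >= 1.
    return min(max(x, 0.0), 1.0)


def difference_quotient(x, step):
    return (F(x + step) - F(x)) / step


if __name__ == "__main__":
    for point in (0.0, 0.5, 1.0):
        right = difference_quotient(point, 1e-6)
        left = difference_quotient(point, -1e-6)
        print(point, left, right)
```

At 0 and 1 the left and right quotients disagree (0 versus 1), while at interior points such as 0.5 both are close to 1, the value of the integrand there.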

We say $f$ is locally integrable on $(a,b)$ if $f$ is integrable on $(c,d)$ for all
$[c,d] \subset (a,b)$. Every continuous function is locally integrable. Let $f$ be locally
integrable arbitrary on $(a,b)$, fix $c$ in $(a,b)$, and let

$$F_c(x) = \int_c^x f(t)\, dt, \qquad a < x < b.$$

A real $x$ in $(a,b)$ is a Lebesgue point of $f$ if $F_c'(x)$ exists and equals $f(x)$.
For any other $c'$ in $(a,b)$, $F_c - F_{c'}$ is a constant (Theorem 4.3.3). Thus $F_c'(x)$
exists iff $F_{c'}'(x)$ exists, in which case they are equal.

If f is continuous, we know every point x in (a,b) is a Lebesgue point, 
but the above example shows that some points may fail to be Lebesgue in 
general. Thus the best one can expect is that the non-Lebesgue points form 
a negligible set in (a,b). This is made precise as follows. 

Define the length³ of $A \subset \mathbf{R}$ by

$$\operatorname{length}(A) = \operatorname{area}(A \times (0,1)).$$

Then $\operatorname{length}(I) = b - a$ for any interval $I = (a,b)$ (Theorem 4.2.1). More
generally (Exercise 6.2.4), for $U \subset \mathbf{R}$ open, $\operatorname{length}(U)$ equals the sum of the
lengths of its component intervals.

A set $N \subset (a,b)$ is negligible if $\operatorname{length}(N) = 0$. Given $f, g$ on $(a,b)$ and
$A \subset (a,b)$, we say $f(x) = g(x)$ for almost all $x \in A$, or $f = g$ almost
everywhere on $A$, if the set $N = \{x \in A : f(x) \ne g(x)\}$ is negligible.

That this is a reasonable interpretation of "negligible" is supported by: if
$f = g$ almost everywhere on $(a,b)$, then $f$ is measurable iff $g$ is measurable
(Exercise 6.3.4).

For $A \subset (a,b)$ and $f$ on $(a,b)$, define

$$\int_A f(x)\, dx = \int_a^b 1_A(x) f(x)\, dx$$

whenever $1_A f$ is nonnegative or integrable. Clearly this depends only on the
restriction of $f$ to $A$. Then for arbitrary $f$ and $N$ negligible,

$$\int_N f(x)\, dx = 0. \tag{6.3.1}$$

Indeed, by the method of exhaustion and dilation invariance,

$$\operatorname{area}(N \times (0,\infty)) = \lim_{n \to \infty} \operatorname{area}(N \times (0,n)) = \lim_{n \to \infty} n \cdot \operatorname{area}(N \times (0,1)) = 0.$$

But for $f \ge 0$ the subgraph of $1_N f$ is contained in $N \times (0,\infty)$ and
$(1_N f)^\pm = 1_N f^\pm$. Thus (6.3.1) follows for arbitrary $f$.

³ This is one-dimensional Lebesgue measure (Exercise 6.3.5).

Now suppose $f$ is defined almost everywhere on $(a,b)$, i.e., suppose $f$ is
defined only on $(a,b) \setminus N'$ for some $N' \subset (a,b)$ negligible. If we set

$$\int_a^b f(x)\, dx = \int_{(a,b) \setminus N'} f(x)\, dx, \tag{6.3.2}$$

then (6.3.1) is valid for any negligible $N$.
More generally, let $f$ and $g$ be arbitrary, defined almost everywhere on
$(a,b)$ and $f = g$ almost everywhere on $(a,b)$. Then

$$\int_a^b f(x)\, dx = \int_a^b g(x)\, dx$$

in the sense that the left side exists iff the right side exists, in which case both
sides agree. Indeed, let $\{f = g\}$ denote the set in $(a,b)$ where the functions
are both defined and agree, and let $N = (a,b) \setminus \{f = g\}$. Then $N$ is negligible,
and for $f \ge 0$ and $g \ge 0$ (here we use monotonicity and subadditivity of area),

$$\int_a^b f(x)\, dx \le \int_{\{f = g\}} f(x)\, dx + \int_N f(x)\, dx = \int_{\{f = g\}} f(x)\, dx \le \int_a^b f(x)\, dx;$$

hence,

$$\int_a^b f(x)\, dx = \int_{\{f = g\}} f(x)\, dx = \int_{\{f = g\}} g(x)\, dx = \int_a^b g(x)\, dx. \tag{6.3.3}$$

The general case follows by applying this to $f^\pm$ and $g^\pm$. In particular, (6.3.3)
implies the integral as defined by (6.3.2) does not depend on $N'$.

The following establishes that almost every x in (a,b) is a Lebesgue point 
of f. This is the Lebesgue differentiation theorem. 


Theorem 6.3.1 (First Fundamental Theorem of Calculus). Let $f$ be
locally integrable arbitrary on $(a,b)$, fix $c$ in $(a,b)$, and let

$$F(x) = \int_c^x f(t)\, dt, \qquad a < x < b. \tag{6.3.4}$$

Then

$$F'(x) = f(x) \quad \text{for almost all } x \text{ in } (a,b) \tag{6.3.5}$$

iff $f$ is measurable.


Note existence of $F'(x)$ for almost all $x$ is part of the claim. This is a direct
generalization of Theorem 4.4.2 and is established in §6.6.

We now show that it is possible to have $F' = 0$ almost everywhere for
nonconstant $F$.

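To construct such an example, we use the one-dimensional version of the
Cantor set (§4.1). The Cantor set $C \subset [0,1]$ is the set of reals of the form

$$x = \sum_{n=1}^\infty \frac{d_n}{3^n}, \qquad d_n = 0 \text{ or } 2,\ n \ge 1, \tag{6.3.6}$$

so $x \notin C$ iff for some $N \ge 1$, $d_N = 1$ and $d_n$, $n > N$, are not all equal. Thus,
$x \notin C$ iff for some $N \ge 1$

$$x = \sum_{n=1}^{N-1} \frac{d_n}{3^n} + \frac{1}{3^N} + \sum_{n=N+1}^\infty \frac{d_n}{3^n}$$

with $d_n$, $n > N$, not all equal. But this is the same as $x$ belonging to one of
the $2^{N-1}$ component intervals

$$\left(\sum_{n=1}^{N-1} \frac{d_n}{3^n} + \frac{1}{3^N},\ \sum_{n=1}^{N-1} \frac{d_n}{3^n} + \frac{2}{3^N}\right) \tag{6.3.7}$$

for some $N \ge 1$. Note $x$ is one of the endpoints iff for some $N \ge 1$, $d_n$, $n > N$,
are all equal. Thus $C^c$ is the union of $(1/3, 2/3)$, $(1/9, 2/9)$, $(7/9, 8/9)$, ... and

$$\operatorname{length}(C^c) = \sum_{N=1}^\infty \frac{2^{N-1}}{3^N} = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1$$

by Exercise 6.2.4, which implies $\operatorname{length}(C) = \operatorname{length}([0,1]) - \operatorname{length}(C^c) = 0$;
thus, $C$ is negligible. Another way to measure $C$ is to note that the dilate $3C$
equals the disjoint union of $C$ and its translate $C + 2$; hence,⁴

$$3 \operatorname{length}(C) = \operatorname{length}(3C) = \operatorname{length}(C \cup (C + 2)) = 2 \operatorname{length}(C).$$

Thus, $C$ is negligible.

⁴ Interpreted appropriately, this shows $\alpha = \log_3 2$ is the Hausdorff dimension of $C$.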
For $x$ in $C$ given by (6.3.6), define

$$F(x) = \sum_{n=1}^\infty \frac{d_n/2}{2^n}.$$

Then $F(c) = F(d)$ whenever $(c,d)$ is of the form (6.3.7), so we extend the
definition of $F$ to all of $[0,1]$ by defining $F$ to be constant on each interval
$[c,d]$. Then the Cantor function $F$ is increasing and Hölder continuous with
exponent $\alpha = \log_3 2$ (Exercise 6.3.9) on $[0,1]$. Since $F(0) = 0$, $F(1) = 1$,
and $F'(x) = 0$ on $C^c$, $F$ is an example of a nonconstant function satisfying
$F'(x) = 0$ almost everywhere. See [16] for an interactive demonstration
(Figure 6.1).
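The Cantor function can be approximated by the piecewise linear iterates of Exercises 6.3.7 and 6.3.8: start from the identity on [0,1], then define the next iterate as half the previous one evaluated at 3x on [0, 1/3], the constant 1/2 on [1/3, 2/3], and half of (previous iterate at 3x - 2, plus 1) on [2/3, 1]. The sketch below evaluates these iterates in exact rational arithmetic and also sums the lengths of the removed middle thirds. The function names `cantor_iterate` and `removed_length` are hypothetical.

```python
from fractions import Fraction


def cantor_iterate(n, x):
    # n-th iterate of the recursion, evaluated at 0 <= x <= 1.
    x = Fraction(x)
    if n == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor_iterate(n - 1, 3 * x) / 2
    if x <= Fraction(2, 3):
        return Fraction(1, 2)
    return (cantor_iterate(n - 1, 3 * x - 2) + 1) / 2


def removed_length(stages):
    # Total length of the 2**(N-1) intervals of length 3**(-N), N = 1..stages.
    return sum(Fraction(2 ** (N - 1), 3 ** N) for N in range(1, stages + 1))


if __name__ == "__main__":
    print(float(cantor_iterate(20, Fraction(1, 4))))
    print(float(removed_length(30)))
```

The point 1/4 has ternary digits 0, 2, 0, 2, ... and so lies in the Cantor set; halving its digits gives the binary expansion of 1/3, which the iterates approach.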


Fig. 6.1 The Cantor function 


Thus functions are not determined by their derivatives without additional 
restrictions. 

Let $F$ be defined on $[a,b]$. We say $F$ is absolutely continuous on $[a,b]$ (this
is due to Vitali [14]) if for any $\epsilon > 0$, there is a $\delta > 0$ such that for any
disjoint open intervals $I_k = (c_k, d_k)$, $1 \le k \le N$, in $(a,b)$,

$$\sum_{k=1}^N (d_k - c_k) < \delta \quad \text{implies} \quad \sum_{k=1}^N |F(d_k) - F(c_k)| < \epsilon.$$

More generally, we say $F$ is locally absolutely continuous on $(a,b)$ if $F$ is
absolutely continuous on $[c,d]$ for all $[c,d] \subset (a,b)$.

Recall a primitive of $f$ on $(a,b)$ was defined in §3.7 to be a differentiable
function $F$ satisfying $F'(x) = f(x)$ for all $x$ in $(a,b)$. We now define a primitive
of $f$ on $(a,b)$ to be a locally absolutely continuous function $F$ on $(a,b)$
satisfying $F'(x) = f(x)$ almost everywhere on $(a,b)$.

Then we have the following results for primitives to be established in §6.5. 

Theorem 6.3.2. Let $f$ be continuous on $(a,b)$. Then $F'(x) = f(x)$ for all $x$
in $(a,b)$ iff $F$ is locally absolutely continuous on $(a,b)$ and $F'(x) = f(x)$ for
almost all $x$ in $(a,b)$.


Thus the definition of primitive in §3.7 and the definition above are con- 
sistent for continuous functions. In what follows, we use primitive as above. 


Theorem 6.3.3. A function f has a primitive F on (a,b) iff f is measurable 
and locally integrable on (a,b), in which case F is given by (6.3.4). 


Theorem 6.3.4. A function F is a primitive on (a,b) of some f on (a,b) iff 
F is locally absolutely continuous on (a,b), in which case f is given by F’. 


Theorem 6.3.5. Any two primitives of a locally integrable measurable f on 
(a,b) differ by a constant. 


The following is a consequence of the above results exactly as in §4.4. 


Theorem 6.3.6 (Second Fundamental Theorem of Calculus). Let $f$
be nonnegative or integrable on $(a,b)$ and suppose $f$ has a primitive $F$ on
$(a,b)$. Then $F(b-)$ and $F(a+)$ exist, and

$$\int_a^b f(x)\, dx = F(b-) - F(a+).$$

Note while the Cantor function is uniformly continuous on $[0,1]$
(Exercise 6.3.9), it cannot be locally absolutely continuous on $(0,1)$, since it would
then be a primitive of $f(x) = 0$, contradicting four of the last five theorems.
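The defining condition of absolute continuity can be seen to fail for the Cantor function on [0,1] by a direct count: the 2^n intervals left after n middle-third removals have total length (2/3)^n, which can be made as small as desired, yet the increments of the function across them always add up to 1. The sketch below checks this exactly, using the piecewise linear iterates of Exercise 6.3.7, which agree with the Cantor function at the endpoints involved. It illustrates the condition on [0,1] only; transferring it to a closed subinterval of (0,1) would use a scaled copy of the same covering. All names are hypothetical.

```python
from fractions import Fraction


def cantor_iterate(n, x):
    # n-th iterate of the recursion in Exercise 6.3.7, for 0 <= x <= 1.
    x = Fraction(x)
    if n == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor_iterate(n - 1, 3 * x) / 2
    if x <= Fraction(2, 3):
        return Fraction(1, 2)
    return (cantor_iterate(n - 1, 3 * x - 2) + 1) / 2


def stage_intervals(n):
    # The 2**n intervals remaining after n middle-third removals.
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for c, d in intervals:
            third = (d - c) / 3
            nxt.append((c, c + third))
            nxt.append((d - third, d))
        intervals = nxt
    return intervals


def total_length_and_variation(n):
    ivs = stage_intervals(n)
    length = sum(d - c for c, d in ivs)
    variation = sum(cantor_iterate(n, d) - cantor_iterate(n, c) for c, d in ivs)
    return length, variation


if __name__ == "__main__":
    for n in (1, 3, 6):
        print(n, total_length_and_variation(n))
```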

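Let $A \subset \mathbf{R}$. Since $\operatorname{length}(A) = \operatorname{area}(A \times (0,1))$, by Exercise 4.5.6, given
$\epsilon > 0$, there is an open $G \subset \mathbf{R}^2$ containing $A \times (0,1)$ and satisfying
$\operatorname{area}(G) \le \operatorname{length}(A) + \epsilon$. The following shows we may choose $G = U \times (0,1)$. This will
be useful frequently below.


Theorem 6.3.7. For $A \subset \mathbf{R}$ arbitrary and $\epsilon > 0$, there is an open set $U \subset \mathbf{R}$
with $A \subset U$ and

$$\operatorname{length}(A) \le \operatorname{length}(U) \le \operatorname{length}(A) + \epsilon. \tag{6.3.8}$$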


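Choose $\delta > 0$ small enough to satisfy

$$\frac{\operatorname{length}(A) + \delta}{1 - 2\delta} < \operatorname{length}(A) + \epsilon.$$

Since $\operatorname{length}(A) = \operatorname{area}(A \times (0,1))$, by definition of area, there is (step 2 in
the proof of Theorem 4.2.1) an open paving $(Q_n)$ of $A \times (0,1)$ satisfying

$$\sum_{n=1}^\infty \|Q_n\| < \operatorname{length}(A) + \delta.$$

Let $x \in A$.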
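Since $\{x\} \times [\delta, 1 - \delta]$ is a compact rectangle, there is a finite subset $S$ of
$\mathbf{N}$ satisfying

$$\{x\} \times [\delta, 1 - \delta] \subset \bigcup_{n \in S} Q_n. \tag{6.3.9}$$

By discarding the rectangles $Q_n$, $n \in S$, that do not intersect $\{x\} \times [\delta, 1 - \delta]$,
we may assume

$$(\{x\} \times [\delta, 1 - \delta]) \cap Q_n \ne \emptyset \quad \text{for every } n \in S. \tag{6.3.10}$$

Given any finite subset $S$ of $\mathbf{N}$, let $A_S$ be the set of $x \in A$ satisfying (6.3.9)
and (6.3.10). By the above, we conclude $A = \bigcup \{A_S : S \subset \mathbf{N}\}$. Now with
$Q_n = I_n \times J_n$, $n \ge 1$, let

$$I_S = \bigcap \{I_n : n \in S\}, \qquad J_S = \bigcup \{J_n : n \in S\}, \qquad U = \bigcup \{I_S : A_S \ne \emptyset\}.$$

Then $A_S \subset I_S$. Moreover, if $A_S$ is nonempty, then $[\delta, 1 - \delta] \subset J_S$; hence,
$I_S \times [\delta, 1 - \delta] \subset \bigcup \{Q_n : n \in \mathbf{N}\}$. Thus $A \subset U$, $U$ is open, and
$U \times [\delta, 1 - \delta] \subset \bigcup_{n=1}^\infty Q_n$. Hence $\operatorname{length}(A) \le \operatorname{length}(U)$ and

$$(1 - 2\delta) \operatorname{length}(U) = \operatorname{area}(U \times [\delta, 1 - \delta]) \le \sum_{n=1}^\infty \|Q_n\| < \operatorname{length}(A) + \delta.$$

Dividing by $1 - 2\delta$, the result follows.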


Exercises 


6.3.1. A negligible set is measurable. 


6.3.2. A subset of a negligible set is negligible, and a countable union of 
negligible sets is negligible. 


6.3.3. For $A, B \subset \mathbf{R}$, the symmetric difference is $|A - B| = (A \cup B) \setminus (A \cap B)$. If $|A - B|$ is negligible, then $A$ is measurable iff $B$ is measurable.


6.3.4. Suppose f = g almost everywhere on (a,b). If f is measurable on 
(a,b), so is g. 


6.3.5. Given $A \subset \mathbf{R}$, a paving of $A$ is a sequence of intervals $(I_n)$ satisfying $A \subset \bigcup_{n=1}^\infty I_n$. For any interval $I$ with endpoints $a$, $b$, let $\|I\| = b - a$. Show

$$\operatorname{length}(A) = \inf\left\{\sum_{n=1}^\infty \|I_n\| : \text{all pavings } (I_n) \text{ of } A\right\}.$$


6.3.6. Let $0 < \alpha < 1$. If $A \subset \mathbf{R}$ satisfies $\operatorname{length}(A \cap (a,b)) \le \alpha(b-a)$ for all $a < b$, then $A$ is negligible (compare with Exercise 4.2.15).


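6.3.7. Let $F_0(x) = x$, $0 \le x \le 1$, and define $F_n$ on $[0,1]$, $n \ge 1$, recursively by

$$2F_{n+1}(x) = \begin{cases} F_n(3x), & 0 \le x \le \frac13,\\ 1, & \frac13 \le x \le \frac23,\\ F_n(3x-2) + 1, & \frac23 \le x \le 1. \end{cases}$$

Show $F_n$, $n \ge 0$, is increasing, piecewise linear, $F_n(0) = 0$, $F_n(1) = 1$, and $0 \le F_n'(x) \le (3/2)^n$ almost everywhere.
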
6.3.8. With $F_n$, $n \ge 0$, as in the previous exercise, let

$$e_n = \max_{0 \le x \le 1} |F_{n+1}(x) - F_n(x)|, \qquad n \ge 0.$$

Show that $e_{n+1} \le e_n/2$, $n \ge 0$, and use this to show $(F_n(x))$ is Cauchy. If $F(x) = \lim_{n\to\infty} F_n(x)$, $0 \le x \le 1$, show $F_n \to F$ uniformly and

$$2F(x) = \begin{cases} F(3x), & 0 \le x \le \frac13,\\ 1, & \frac13 \le x \le \frac23,\\ F(3x-2) + 1, & \frac23 \le x \le 1. \end{cases}$$

Conclude $F$ is the Cantor function.
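
The recursion in Exercises 6.3.7 and 6.3.8 can be explored numerically. The following Python sketch is an illustration, not part of the exercises; the names `cantor_iterate` and `sup_gap` are made up for this purpose. It evaluates $F_n$ exactly with rational arithmetic and measures $e_n$ on the grid of multiples of $3^{-6}$. Since that grid contains every breakpoint of $F_0, \dots, F_5$, the grid maximum should agree with $e_n$ for $n \le 4$.

```python
from fractions import Fraction

def cantor_iterate(n, x):
    """Evaluate F_n(x) from Exercise 6.3.7 exactly, for a Fraction x in [0, 1]."""
    if n == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor_iterate(n - 1, 3 * x) / 2
    if x <= Fraction(2, 3):
        return Fraction(1, 2)
    return (cantor_iterate(n - 1, 3 * x - 2) + 1) / 2

def sup_gap(n, grid):
    """Maximum over the grid of |F_{n+1}(x) - F_n(x)|."""
    return max(abs(cantor_iterate(n + 1, x) - cantor_iterate(n, x)) for x in grid)

# Grid of multiples of 3**-6 (an assumption of this sketch, chosen to hit all breakpoints).
grid = [Fraction(k, 3**6) for k in range(3**6 + 1)]
gaps = [sup_gap(n, grid) for n in range(5)]
for n, gap in enumerate(gaps):
    print(n, gap)
```

On this grid the computed gaps start at $1/6$ and halve at each step, consistent with $e_{n+1} \le e_n/2$. Evaluating `cantor_iterate(10, Fraction(1, 4))` gives a value within $10^{-4}$ of $1/3$, the value of the Cantor function at $1/4$.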


6.3.9. Let $\alpha = \log_3 2$. The Cantor function is increasing and continuous and satisfies

$$0 \le F(z) - F(x) \le F(z - x), \qquad (x/2)^\alpha \le F(x) \le x^\alpha, \qquad 0 \le x \le z \le 1.$$

Conclude

$$0 \le F(z) - F(x) \le (z - x)^\alpha, \qquad 0 \le x \le z \le 1.$$

(For $(x/2)^\alpha \le F(x)$, use $\sum x_n^\alpha \ge \left(\sum x_n\right)^\alpha$. For the others, first prove by induction with $F_n$, $n \ge 0$, replacing $F$, and then take the limit.)


6.4 The Sunrise Lemma 


Let $G$ be continuous on $[a,b]$. Call an interval $(c,d) \subset (a,b)$ balanced if $G(c) = G(d)$. The following result, due to Riesz [12], shows that $[a,b]$ may be decomposed into balanced intervals on whose complement $G$ is decreasing.

Recall (§6.2) every open set $U \subset \mathbf{R}$ is a finite or countable disjoint union of open intervals, the component intervals of $U$.

Theorem 6.4.1 (Sunrise Lemma). Let $G$ be continuous on $[a,b]$, and let

$$U_G = \{c \in (a,b) : G(x) > G(c) \text{ for some } x > c\}.$$

Then $U_G$ is open and $G(c) \le G(d)$ for each component interval $(c,d)$ of $U_G$ (Figure 6.2).

Moreover, if $G(a) \ge G(b)$, there is an open set $W_G \subset (a,b)$ such that each component interval of $W_G$ is balanced and $G(c) \ge G(x)$ for $a < c \notin W_G$ and $x \ge c$.


Fig. 6.2 The Sunrise Lemma


To understand the intuition behind the name and the statement, think of the graph of $G$ as a mountainous region with the sun rising at $\infty$ from the right. Then $U_G$ is the portion of the $x$-axis where the graph is in shadow. See [17] for an interactive demonstration.
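
The shadow picture can be imitated on a grid. The Python sketch below is a discrete analogue under stated assumptions, not the lemma itself: `shadow_runs` and the sample function $G(x) = \sin(6x) - x$ on $[0,1]$ are illustrative choices. A sample is "in shadow" when some sample to its right is strictly higher. For sampled data the provable statement is that every shadowed sample lies strictly below the first unshadowed sample to its right, which mirrors $G(c) \le G(d)$ for the continuous $G$.

```python
import math

def shadow_runs(g):
    """Maximal runs (i0, i1) of indices i such that g[j] > g[i] for some j > i."""
    n = len(g)
    in_shadow = [False] * n
    best_right = -math.inf  # running maximum of g over indices to the right of i
    for i in range(n - 1, -1, -1):
        in_shadow[i] = best_right > g[i]
        best_right = max(best_right, g[i])
    runs, start = [], None
    for i, flag in enumerate(in_shadow):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs  # the last index is never in shadow, so every run is closed

# Hypothetical sample: G(x) = sin(6x) - x on [0, 1], 401 grid points.
xs = [k / 400 for k in range(401)]
g = [math.sin(6 * x) - x for x in xs]
runs = shadow_runs(g)

# Each shadowed sample is strictly below the sample just right of its run.
below_right_end = all(g[i] < g[i1 + 1] for i0, i1 in runs for i in range(i0, i1 + 1))
print(runs, below_right_end)
```

For this sample there are two runs: one to the left of the first peak, and one covering the final valley up to the right endpoint.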

By continuity of $G$, $U_G$ is open. Let $(c,d)$ be one of the component intervals, and suppose $G(c) > G(d)$. Let $e$ be the largest real in $[c,d]$ satisfying $G(e) = (G(c) + G(d))/2$. Then $G(d) < G(e) < G(c)$ and $c < e < d$; hence, there exists $x > e$ satisfying $G(x) > G(e)$. If $x > d$, then $d \in U_G$, contradicting $d \notin U_G$. If $x < d$, then $G(x) > G(e) > G(d)$, so by the intermediate value property, there exists $f \in (x,d)$ satisfying $G(f) = G(e)$, contradicting the definition of $e$. We conclude $G(c) \le G(d)$. Also by definition of $U_G$, if $a < c \notin U_G$, we have $G(x) \le G(c)$ for $x > c$. This establishes the first part.

For the second part, let $a'$ be the largest real⁵ in $[a,b]$ satisfying $G(a') = G(a)$. We claim $G(a') = \max_{[a',b]} G$. If not, there is a $c > a'$ with $G(c) > G(a)$. By the intermediate value property, there is a $d \in (c,b]$ with $G(d) = G(a)$, contradicting the choice of $a'$. Thus $G(a') = \max_{[a',b]} G$. Now apply the first part to $G$ on $[a',b]$ yielding $U'_G \subset (a',b)$. By definition of $U'_G$, for each component interval $(c,d)$ of $U'_G$, either $G(c) \ge G(d)$ or $c = a'$. If $c = a'$, then $G(c) = \max_{[a',b]} G$, so $G(c) \ge G(d)$. Hence $G(c) = G(d)$ for each component interval $(c,d)$ of $U'_G$. Now set $W_G = (a,a') \cup U'_G$. If $a < c \notin W_G$, then $c = a'$ or $a' < c \notin U'_G$. In either case, $x > c$ implies $G(x) \le G(c)$.

⁵ The proof is valid if $a' = a$ or $a' = b$.

Here is our first application of the sunrise lemma.


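Theorem 6.4.2 (Lebesgue Density Theorem). Let $A \subset \mathbf{R}$ be arbitrary. Then

$$\lim_{x \to c+} \frac{\operatorname{length}(A \cap (c,x))}{x - c} = 1, \qquad \text{for almost all } c \in A, \tag{6.4.1}$$

$$\lim_{x \to c-} \frac{\operatorname{length}(A \cap (x,c))}{c - x} = 1, \qquad \text{for almost all } c \in A. \tag{6.4.2}$$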


Applying (6.4.1) to $-A$ yields (6.4.2); hence, it is enough to derive (6.4.1). Without loss of generality, we may also assume $A$ is bounded, by replacing $A$ by $A \cap (-n,n)$, since

$$A \cap (c,x) = A \cap (-n,n) \cap (c,x), \qquad |c| < n, \ |x| < n.$$

Let $N$ be the complement in $A$ of the set of $c$ satisfying (6.4.1). We show $N$ is negligible.

Since the ratio in (6.4.1) is $\le 1$, the limit in (6.4.1) fails to equal 1 at $c$ iff (§2.2) at least one right limit point is strictly less than 1. But this happens iff there is $x_j \to c+$ with

$$\frac{\operatorname{length}(A \cap (c,x_j))}{x_j - c} < \alpha, \qquad j \ge 1,$$

for some $0 < \alpha < 1$. It follows that $N$ is the union of $N_\alpha$, $0 < \alpha < 1$, where

$$N_\alpha = \{c \in A : G_\alpha(x_j) > G_\alpha(c) \text{ for some sequence } x_j \to c+\}$$

and

$$G_\alpha(x) = \alpha x - \operatorname{length}(A \cap (-\infty, x)).$$

Hence it is enough to show $N_\alpha$ is negligible for each $0 < \alpha < 1$.
For $a < b$, apply the first part of the Sunrise Lemma to $G_\alpha$ on $[a,b]$ obtaining $U_{G_\alpha} \subset (a,b)$. Then $N_\alpha \cap (a,b) \subset U_{G_\alpha}$ and $G_\alpha(d) \ge G_\alpha(c)$; hence,

$$\operatorname{length}(A \cap (c,d)) \le \alpha(d - c),$$

for each component interval $(c,d)$ of $U_{G_\alpha}$. By Exercise 6.2.4 and $N_\alpha \subset A$, summing over the component intervals $(c,d)$ of $U_{G_\alpha}$,

$$\begin{aligned}
\operatorname{length}(N_\alpha \cap (a,b)) &\le \sum_{(c,d)} \operatorname{length}(N_\alpha \cap (c,d)) \\
&\le \sum_{(c,d)} \operatorname{length}(A \cap (c,d)) \\
&\le \sum_{(c,d)} \alpha(d - c) \le \alpha(b - a).
\end{aligned}$$

By Exercise 6.3.6, this implies $N_\alpha$ is negligible.
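
For a set made of finitely many intervals, the ratio in (6.4.1) can be computed exactly. The Python sketch below is an illustration only; the helper names `length_in` and `right_ratio` and the example set $A = [0,1] \cup [2,3]$ are assumptions of the sketch. It shows the ratio equal to 1 near an interior point and equal to 0 just to the right of the endpoint $c = 1$, an exceptional point of the kind the theorem allows, since the exceptional set is negligible.

```python
from fractions import Fraction

def length_in(intervals, lo, hi):
    """Length of (disjoint union of the given intervals) intersected with (lo, hi)."""
    return sum(max(Fraction(0), min(r, hi) - max(l, lo)) for l, r in intervals)

def right_ratio(intervals, c, x):
    """The ratio length(A ∩ (c, x)) / (x - c) from (6.4.1), for x > c."""
    return length_in(intervals, c, x) / (x - c)

# Hypothetical example set A = [0, 1] ∪ [2, 3].
A = [(Fraction(0), Fraction(1)), (Fraction(2), Fraction(3))]

steps = [Fraction(1, 10**k) for k in range(1, 6)]
interior = [right_ratio(A, Fraction(1, 2), Fraction(1, 2) + h) for h in steps]
endpoint = [right_ratio(A, Fraction(1), Fraction(1) + h) for h in steps]
print(interior)
print(endpoint)
```

Only small $x - c$ matters: with $c = 1$ and $x = 5/2$ the ratio is $1/3$, but it drops to 0 once $x < 2$.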
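For $c \in \mathbf{R}$, define

$$D_cF(x) = \frac{F(x) - F(c)}{x - c}, \qquad x \ne c. \tag{6.4.3}$$

Note $F$ increasing implies $D_cF(x) \ge 0$ for $x \ne c$. A function $F$ is Lipschitz on $[a,b]$ with constant $\lambda > 0$ if

$$|F(x) - F(x')| \le \lambda |x - x'|, \qquad x, x' \in [a,b].$$

Equivalently, $F$ is Lipschitz with constant $\lambda$ if $|D_cF(x)| \le \lambda$, $a \le c < x \le b$. We say $F$ has constant slope $\lambda$ on $(a,b)$ if $D_cF(x) = \lambda$, $a \le c < x \le b$.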


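Theorem 6.4.3 (Lipschitz Approximation). Let $F$ be continuous on $[a,b]$ and let $\lambda_0 = D_aF(b)$. Then for $\lambda \ge \lambda_0$, there is an open set $W_\lambda \subset (a,b)$ and $F_\lambda$ continuous on $[a,b]$ such that $F_\lambda$ agrees with $F$ on $W_\lambda^c$, $F_\lambda$ has constant slope $\lambda$ on each component interval of $W_\lambda$, and $D_cF_\lambda(x) \le \lambda$ for $a \le c < x \le b$. If $\mu \le \lambda$ and $D_cF(x) \ge \mu$ for $a \le c < x \le b$, then $D_cF_\lambda(x) \ge \mu$ for $a \le c < x \le b$.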


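If $G(x) = F(x) - \lambda x$, then $G(a) \ge G(b)$. Applying the second part of the Sunrise Lemma, $F(d) - \lambda d = F(c) - \lambda c$ for each component interval $(c,d)$ of $W_G$. Given $x$ in $[a,b]$, let $\underline{x} = \overline{x} = x$ if $x \notin W_G$, and let $\underline{x} = c$ and $\overline{x} = d$ if $x$ is in a component interval $(c,d)$. Define $F_\lambda$ by setting

$$F_\lambda(x) - \lambda x = F(\underline{x}) - \lambda \underline{x} = F(\overline{x}) - \lambda \overline{x}, \qquad a \le x \le b,$$

and $W_\lambda = W_G$. Then $F_\lambda$ has constant slope $\lambda$ on each component interval. Exercise 6.4.1 shows $F_\lambda(x) - \lambda x$ is continuous and decreasing on $[a,b]$; hence, $D_cF_\lambda(x) \le \lambda$ for $a \le c < x \le b$. Now assume $D_cF(x) \ge \mu$ for $a \le c < x \le b$, and let $c < x$. If $c$ and $x$ are in the same component interval, then $D_cF_\lambda(x) = \lambda \ge \mu$. If not, then $\overline{c} \le \underline{x}$ and

$$D_cF_\lambda(\overline{c}) = \lambda \ge \mu, \qquad D_{\overline{c}}F_\lambda(\underline{x}) = D_{\overline{c}}F(\underline{x}) \ge \mu, \qquad D_{\underline{x}}F_\lambda(x) = \lambda \ge \mu,$$

which together imply $D_cF_\lambda(x) \ge \mu$.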
Let $f \ge 0$ be measurable and integrable over $(a,b)$. For $A \subset (a,b)$, let

$$\frac{1}{\operatorname{length}(A)} \int_A f(x)\,dx$$

denote the average of $f$ over $A$. By the Lebesgue differentiation theorem,⁶

$$F(x) = \int_a^x f(t)\,dt, \qquad a \le x \le b,$$

implies $F'(x) = f(x)$ almost everywhere on $(a,b)$. Thus from Exercise 6.4.3, we obtain the following.


Theorem 6.4.4.⁷ Let $f \ge 0$ be measurable and integrable over $(a,b)$, and let $\lambda_0$ be the average of $f$ over $(a,b)$. Then for all $\lambda \ge \lambda_0$, there is an open set $W_\lambda \subset (a,b)$ such that the average of $f$ over each component interval of $W_\lambda$ equals $\lambda$, and $0 \le f \le \lambda$ almost everywhere on $W_\lambda^c$.

⁶ Established in §6.6.
⁷ This is a precursor of the Calderón-Zygmund lemma.


Exercises


6.4.1. With $G$ as in the sunrise lemma, suppose $G(a) \ge G(b)$. Define $\tilde{G}$ to be constant on $[c,d]$ for each component interval $(c,d)$ of $W_G$, and $\tilde{G} = G$ elsewhere. Then $\tilde{G}$ is continuous and decreasing on $[a,b]$.


6.4.2. Given $f \ge 0$ integrable, arbitrary on $\mathbf{R}$, and $\lambda > 0$, let

$$f^*(x) = \sup_{b > x} \frac{1}{b - x} \int_x^b f(t)\,dt,$$

and $\{f^* > \lambda\} = \{x : f^*(x) > \lambda\}$. Show that $\{f^* > \lambda\}$ is the union over $n \ge 1$ of $U_{G_n} \subset (-n,n)$ with $G_n(x) = \int_{-n}^x f(t)\,dt - \lambda x$ on $[-n,n]$. Use this to derive the Hardy-Littlewood maximal inequality

$$\lambda \operatorname{length}(\{f^* > \lambda\}) \le \int_{\{f^* > \lambda\}} f(x)\,dx.$$


6.4.3. Let $F$ be continuous increasing on $[a,b]$ and let $\lambda_0 = D_aF(b) \ge 0$. Assume $F'$ exists almost everywhere on $(a,b)$. Then for all $\lambda \ge \lambda_0$, there is an open set $W_\lambda \subset (a,b)$ such that $0 \le F'(c) \le \lambda$ almost everywhere on $W_\lambda^c$ and $D_cF(d) = \lambda$ on each component interval $(c,d)$ of $W_\lambda$.


6.4.4. With $G$ as in the Sunrise Lemma, assume $G(a) \ge G(b)$. If $(c,d)$ is a component interval of $W_G$ with $a < c$, then $G(c) \ge G(x)$ for $c < x \in W_G$.


6.5 Absolute Continuity


Let $F$ be defined on $[a,b]$. Given an open set $U \subset (a,b)$, the variation of $F$ over $U$ is

$$\operatorname{var}(F,U) = \sum_{(c,d)} |F(d) - F(c)|.$$

Here the sum is over the component intervals $(c,d)$ of $U$; thus, the number of terms in the sum may be finite or infinite. Given $U \subset (a,b)$, the total variation $v_F(U)$ is

$$v_F(U) = \sup\{\operatorname{var}(F,U') : U' \subset U \text{ open}\}.$$

When $U = (a,b)$, we write $v_F(a,b)$ instead of $v_F(U)$. We say $F$ has bounded variation on $[a,b]$ if $v_F(a,b) < \infty$. Note $v_F(U) \le v_F(U')$ when $U \subset U'$.

When $F$ is increasing on $[a,b]$, every variation in $(a,b)$ is no greater than $F(b) - F(a)$, and in fact $v_F(a,b) = F(b) - F(a)$. By Exercise 6.5.1, for $F$ increasing this implies

$$v_F(U) = \operatorname{var}(F,U) = \sum_{(c,d)} \left(F(d) - F(c)\right)$$

whenever $U \subset (a,b)$ and the sum is over the component intervals of $U$.
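When $f$ is integrable over $(a,b)$ and $F = F_f$ is given by

$$F(x) = F_f(x) = \int_a^x f(t)\,dt, \qquad a \le x \le b, \tag{6.5.1}$$

and $U' \subset U \subset (a,b)$, summing over the component intervals $(c,d)$ of $U'$ yields

$$\operatorname{var}(F,U') = \sum_{(c,d)} |F(d) - F(c)| \le \sum_{(c,d)} \int_c^d |f(x)|\,dx = \int_{U'} |f(x)|\,dx \le \int_U |f(x)|\,dx.$$

Taking the sup over $U' \subset U$ yields

$$v_F(U) \le \int_U |f(x)|\,dx. \tag{6.5.2}$$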


Let $(U_n)$ be a sequence of open sets in $(a,b)$. We say $F$ is absolutely continuous on $[a,b]$ if $\operatorname{length}(U_n) \to 0$ as $n \to \infty$ implies $v_F(U_n) \to 0$ as $n \to \infty$. Exercise 6.5.2 shows that the definitions of absolute continuity in §6.3 and here are equivalent.

Note absolute continuity on $[a,b]$ implies uniform continuity (§2.3) on $[a,b]$. More generally, we say $F$ is locally absolutely continuous on $(a,b)$ if $F$ is absolutely continuous on $[c,d]$ for all $[c,d] \subset (a,b)$.

If $F$ is absolutely continuous on $[a,b]$, then (Exercise 6.5.3) $v_F(a,b) < \infty$ and $v_F(a,b)$ equals the total variation as defined in Exercise 2.2.4. Thus absolute continuity implies bounded variation. The converse is not true, as the Cantor function is bounded variation but not absolutely continuous.
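
The failure of absolute continuity for the Cantor function can be seen directly from the definition above. The Python sketch below is illustrative; `cantor`, `cantor_stage`, and `length_and_variation` are hypothetical helper names, and the evaluator works only at ternary rationals $k/3^m$, which is all the sketch needs. Taking $U_n$ to be the interiors of the $2^n$ intervals of the $n$-th stage of the Cantor construction, the total length is $(2/3)^n \to 0$ while the variation of $F$ over $U_n$ stays equal to 1.

```python
from fractions import Fraction

def cantor(x):
    """Cantor function at x = k / 3**m in [0, 1] (a Fraction), computed exactly.

    Uses the self-similarity relation of Exercise 6.3.8; the loop is only
    guaranteed to terminate for ternary rationals.
    """
    total, scale = Fraction(0), Fraction(1)
    while 0 < x < 1:
        if x <= Fraction(1, 3):
            x, scale = 3 * x, scale / 2
        elif x < Fraction(2, 3):
            return total + scale / 2
        else:
            total, x, scale = total + scale / 2, 3 * x - 2, scale / 2
    return total + scale * x  # here x is 0 or 1

def cantor_stage(n):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece for l, r in intervals
                     for piece in ((l, l + (r - l) / 3), (r - (r - l) / 3, r))]
    return intervals

def length_and_variation(n):
    """Total length of the stage-n intervals and the variation of F over them."""
    stage = cantor_stage(n)
    total_length = sum(r - l for l, r in stage)
    variation = sum(cantor(r) - cantor(l) for l, r in stage)
    return total_length, variation

for n in (1, 4, 8):
    print(n, length_and_variation(n))
```

Since $F$ is increasing, the variation computed here equals $v_F(U_n)$, so $\operatorname{length}(U_n) \to 0$ does not force $v_F(U_n) \to 0$.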


Theorem 6.5.1. If $F$ is continuous on $[a,b]$ and bounded variation on $[a,b]$, then there are continuous increasing functions $G$, $H$ on $[a,b]$ with $F = G - H$. If $F$ is absolutely continuous on $[a,b]$, then $G$ and $H$ may also be chosen absolutely continuous on $[a,b]$.


By Exercise 6.5.3, $v_F(a,b) < \infty$. By Exercise 2.2.6, $G(x) = v_F(a,x)$, $H(x) = v_F(a,x) - F(x)$, $a \le x \le b$, are increasing and continuous. By Exercise 6.5.4, $v_G(U) = v_F(U)$ for $U \subset (a,b)$ open. Thus $F$ absolutely continuous implies $G$ absolutely continuous. Since $F_1$, $F_2$ absolutely continuous implies⁸ $F_1 + F_2$ is absolutely continuous, $H$ is also absolutely continuous.

⁸ This is most easily seen via the definition in §6.3.

The goal of this section is to derive Theorems 6.3.3, 6.3.4, 6.3.5, and 6.3.2. Here is the main result of this section.


Theorem 6.5.2. If $F$ is absolutely continuous on $[a,b]$, then $F'(x)$ exists for almost all $x$ in $(a,b)$, $F'$ is integrable on $(a,b)$, and

$$F(b) - F(a) = \int_a^b F'(x)\,dx. \tag{6.5.3}$$


In fact, Exercise 6.6.4 shows

$$v_F(a,b) = \int_a^b |F'(x)|\,dx. \tag{6.5.4}$$

This says if $F$ is a primitive of $f$, then $v_F$ is a primitive of $|f|$. Note (6.5.4) together with Exercise 6.5.1 shows that (6.5.2) is in fact an equality.

The basic connection between integrals and absolutely continuous functions is the following.


Theorem 6.5.3. Let $f$ be integrable and measurable on $(a,b)$. If $(A_n)$ is a sequence of arbitrary subsets of $(a,b)$ with $\operatorname{length}(A_n) \to 0$ as $n \to \infty$, then

$$\int_{A_n} f(x)\,dx \to 0, \qquad n \to \infty.$$


By decomposing $f = f^+ - f^-$, we may assume $f \ge 0$. By Theorem 6.3.7, for each $n \ge 1$, there is open $U_n \supset A_n$ with $\operatorname{length}(U_n) \to 0$. By intersecting with $(a,b)$, we may assume $U_n \subset (a,b)$, $n \ge 1$. It is enough to show $\int_{U_n} f(x)\,dx \to 0$. We now argue by contradiction and suppose $\int_{U_n} f(x)\,dx \not\to 0$. By passing to a subsequence, we can assume there is an $\epsilon > 0$ with $\int_{U_n} f(x)\,dx \ge \epsilon$, $n \ge 1$. By passing to a further subsequence, we may assume $\operatorname{length}(U_n) \le 2^{-n}$, $n \ge 1$. Let $V_n$ be the union of $U_j$, $j \ge n+1$. Then

$$\operatorname{length}(V_n) \le \sum_{j=n+1}^\infty \operatorname{length}(U_j) \le \sum_{j=n+1}^\infty 2^{-j} = 2^{-n}, \qquad n \ge 1,$$

and

$$\int_{V_n} f(x)\,dx \ge \int_{U_{n+1}} f(x)\,dx \ge \epsilon, \qquad n \ge 1.$$

Let $B$ denote the intersection of $V_n$, $n \ge 1$. Then $\operatorname{length}(B) \le \operatorname{length}(V_n) \le 2^{-n}$, $n \ge 1$; hence, $\operatorname{length}(B) = 0$. But $1_{V_n} f$ decreases pointwise to $1_B f$ as $n \to \infty$, with $0 \le 1_{V_n} f \le f$, $n \ge 1$. By the dominated convergence theorem,

$$0 = \int_B f(x)\,dx = \lim_{n \to \infty} \int_{V_n} f(x)\,dx \ge \epsilon,$$

a contradiction.

By (6.5.2), an immediate consequence of this is the local absolute continuity of $F_f$ given by (6.5.1).

We next tackle the differentiability of functions. Since $F'(c)$ is the limit of $D_cF$ at $c$, the existence of $F'(c)$ (§3.1) is equivalent (§2.2) to the equality and finiteness of four quantities, the upper and lower limits of $D_cF(x)$ as $x \to c\pm$. These are the four Dini derivatives of $F$ at $c$. Equivalently, the four Dini derivatives are equal iff all right and left limit points of $D_cF$ at $c$ are equal (Exercise 1.5.10 and Theorem 2.2.1).

The next two results yield estimates corresponding to two of the four Dini 
derivatives. 


Theorem 6.5.4. Let $F$ be continuous on $[a,b]$, let $\lambda > 0$, and let

$$U_\lambda = \{c \in (a,b) : D_cF(x) > \lambda \text{ for some } x > c\}.$$

Then

$$\lambda \operatorname{length}(U_\lambda) \le v_F(U_\lambda) \le v_F(a,b). \tag{6.5.5}$$


To derive this, apply the first part of the sunrise lemma to $G(x) = F(x) - \lambda x$ on $[a,b]$. Then $U_\lambda = U_G$, and $G(d) \ge G(c)$ for each component interval $(c,d)$ of $U_\lambda$, which implies $F(d) - F(c) \ge \lambda(d - c)$ for each component interval $(c,d)$ of $U_\lambda$. Hence

$$\lambda \operatorname{length}(U_\lambda) = \sum_{(c,d)} \lambda(d - c) \le \sum_{(c,d)} \left(F(d) - F(c)\right) \le v_F(U_\lambda).$$
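
The estimate (6.5.5) has a grid counterpart that can be checked by machine. In the Python sketch below, which is an illustration under stated assumptions rather than a proof, `shadow_length_and_variation` is a hypothetical helper: it samples $F$ at $n+1$ equally spaced points, counts the samples $c$ for which some later sample $x$ has $D_cF(x) > \lambda$ (the shadow set of $G(x) = F(x) - \lambda x$), and compares $\lambda$ times the resulting length with the variation of the samples. The sample function $F(x) = \sin(8x)$ on $[0,1]$ is an arbitrary choice.

```python
import math

def shadow_length_and_variation(F, a, b, lam, n=2000):
    """Grid versions of length(U_lam) and v_F(a, b) for F sampled at n + 1 points."""
    h = (b - a) / n
    xs = [a + k * h for k in range(n + 1)]
    G = [F(x) - lam * x for x in xs]
    best_right, shadow_count = -math.inf, 0
    for i in range(n, -1, -1):  # scan right to left, tracking the max of G to the right
        if best_right > G[i]:
            shadow_count += 1
        best_right = max(best_right, G[i])
    variation = sum(abs(F(xs[k + 1]) - F(xs[k])) for k in range(n))
    return shadow_count * h, variation

F = lambda x: math.sin(8 * x)  # hypothetical sample function on [0, 1]
results = {lam: shadow_length_and_variation(F, 0.0, 1.0, lam) for lam in (1, 4, 7, 9)}
for lam, (length, variation) in results.items():
    print(lam, lam * length, variation)
```

For $\lambda = 9$ the shadow set is empty, since no chord of $\sin(8x)$ has slope above 8.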


For increasing functions, we have a complementary result.


Theorem 6.5.5. Let $F$ be continuous increasing on $[a,b]$, let $\lambda > 0$, and let

$$V_\lambda = \{c \in (a,b) : D_cF(x) < \lambda \text{ for some } x < c\}.$$

Then

$$v_F(V_\lambda) \le \lambda \operatorname{length}(V_\lambda) \le \lambda \operatorname{length}([a,b]). \tag{6.5.6}$$


To derive this, apply the first part of the Sunrise Lemma to $G(x) = F(-x) + \lambda x$ on $-[a,b] = [-b,-a]$. Then $V_\lambda = -U_G$, and $G(-c) \ge G(-d)$ for each component interval $-(c,d) = (-d,-c)$ of $U_G$, which implies $F(d) - F(c) \le \lambda(d - c)$ for each component interval $(c,d)$ of $V_\lambda$. Hence

$$v_F(V_\lambda) = \sum_{(c,d)} \left(F(d) - F(c)\right) \le \lambda \sum_{(c,d)} (d - c) = \lambda \operatorname{length}(V_\lambda).$$


Theorem 6.5.6. If $F$ is continuous and bounded variation on $\mathbf{R}$, then $F'(x)$ exists for almost all $x$ in $\mathbf{R}$.


Since $F$ is the difference of two continuous increasing functions, we may assume $F$ is continuous increasing. We first show each limit point of $D_cF$ at $c$ is finite, for almost all $c \in \mathbf{R}$. Let $[a,b] \subset \mathbf{R}$.

Since $F$ is bounded variation, $v_F(a,b) < \infty$. Let $U_n = U_\lambda$ as in Theorem 6.5.4 on $[a,b]$ with $\lambda = n$. Since $n \operatorname{length}(U_n) \le v_F(a,b)$ for $n \ge 1$, $\operatorname{length}(U_n) \to 0$ as $n \to \infty$; hence,

$$\operatorname{length}\left(\bigcap_{n=1}^\infty U_n\right) = \lim_{n \to \infty} \operatorname{length}(U_n) = 0.$$

If a right limit point of $D_cF$ at $c \in (a,b)$ equals $\infty$, then $c \in \bigcap_{n=1}^\infty U_n$. We conclude right limit points of $D_cF$ at $c$ are finite for almost all $c$. Similarly for left limit points.

Now all limit points of $D_cF$ at $c$ are equal iff all left limit points of $D_cF$ at $c$ are not less than all right limit points of $D_cF$ at $c$ and all right limit points of $D_cF$ at $c$ are not less than all left limit points of $D_cF$ at $c$.

Let $\tilde{F}(x) = -F(-x)$. Since $D_c\tilde{F}(x) = D_{-c}F(-x)$, the second assertion follows from the first. Thus, it is enough to establish, for almost all $c$: all left limit points of $D_cF$ at $c$ are not less than all right limit points of $D_cF$ at $c$.

To this end, for $0 < m < n$ rational, define

$$\begin{aligned}
E_m^- &= \{c \in \mathbf{R} : \text{there is a left limit point of } D_cF \text{ less than } m\},\\
E_n^+ &= \{c \in \mathbf{R} : \text{there is a right limit point of } D_cF \text{ greater than } n\},
\end{aligned}$$

and

$$N = \bigcup_{m<n} E_m^- \cap E_n^+ = \bigcup_{m<n} E_{m,n}.$$

Let $L$ be a left limit point of $D_cF$ at $c$, and let $L'$ be a right limit point of $D_cF$ at $c$. Then $L < L'$ iff there are rationals $0 < m < n$ satisfying $L < m < n < L'$, which happens iff $c \in N$. Thus, it is enough to show $E_{m,n}$ is negligible for all $0 < m < n$.

Given a compact interval $[a,b]$, apply Theorem 6.5.5 on $[a,b]$ with $\lambda = m$ to get $V_m \subset (a,b)$ with

$$v_F(V_m) \le m \operatorname{length}(V_m) \le m \operatorname{length}([a,b]).$$

If $(c,d)$ is a component interval of $V_m$, apply Theorem 6.5.4 on $[c,d]$ with $\lambda = n$ to get $U_{m,n} \subset (c,d)$ with

$$n \operatorname{length}(U_{m,n}) \le v_F(U_{m,n}) \le v_F(c,d).$$

Since $E_{m,n} \cap (c,d) \subset U_{m,n}$, it follows that

$$n \operatorname{length}(E_{m,n} \cap (c,d)) \le v_F(c,d).$$

Since $E_{m,n} \cap (a,b) = E_{m,n} \cap V_m$, summing over all component intervals of $V_m$ yields

$$n \operatorname{length}(E_{m,n} \cap (a,b)) \le v_F(V_m) \le m \operatorname{length}(V_m) \le m(b - a);$$

hence,

$$\operatorname{length}(E_{m,n} \cap (a,b)) \le \frac{m}{n}(b - a), \qquad a < b.$$

Exercise 6.3.6 now implies $E_{m,n}$ is negligible.
Let $F$ be continuous increasing on $[a,b]$, and define $F(x) = F(b)$ for $x > b$ and $F(x) = F(a)$ for $x < a$. Then $F$ is continuous increasing on $\mathbf{R}$; hence, $F'(x) \ge 0$ exists for almost all $x$.

For $n \ge 1$, let

$$f_n(x) = \begin{cases} n\left(F(x + 1/n) - F(x)\right), & \text{if } F'(x) \text{ exists},\\ F(x+1) - F(x), & \text{otherwise}, \end{cases} \tag{6.5.7}$$

and let

$$f(x) = \begin{cases} F'(x), & \text{if } F'(x) \text{ exists},\\ F(x+1) - F(x), & \text{otherwise}. \end{cases}$$

Then $f_n(x) \to f(x)$ pointwise on $\mathbf{R}$ as $n \to \infty$, and $f = F'$ almost everywhere. Hence, we may use Fatou's lemma to conclude

$$\int_a^b f(x)\,dx \le \liminf_{n \to \infty} \int_a^b f_n(x)\,dx.$$

But $F$ is continuous, so applying the first fundamental theorem of calculus in Chapter 4 to $F$ yields

$$n \int_a^{a+1/n} F(x)\,dx \to F(a) \qquad \text{and} \qquad n \int_b^{b+1/n} F(x)\,dx \to F(b).$$

Moreover, for $n \ge 1$,

$$\begin{aligned}
\int_a^b f_n(x)\,dx &= n \int_{a+1/n}^{b+1/n} F(x)\,dx - n \int_a^b F(x)\,dx \\
&= n \int_b^{b+1/n} F(x)\,dx - n \int_a^{a+1/n} F(x)\,dx.
\end{aligned}$$

Passing to the limit, we obtain

$$\int_a^b F'(x)\,dx = \int_a^b f(x)\,dx \le \liminf_{n \to \infty} \int_a^b f_n(x)\,dx = F(b) - F(a).$$

Since every continuous bounded variation function is the difference of two continuous increasing functions, we conclude $F'$ exists almost everywhere and is integrable, when $F$ is continuous bounded variation. Moreover, modifying the above argument establishes (Exercise 6.5.8)

$$\int_a^b |F'(x)|\,dx \le v_F(a,b) \tag{6.5.8}$$

for $F$ continuous bounded variation on $[a,b]$. In particular, applying this last result on $[c,d] \subset (a,b)$, we conclude: If $F$ is locally absolutely continuous on $(a,b)$, then $F'$ exists almost everywhere; thus, $F$ is a primitive of a measurable and locally integrable $f = F'$.

Conversely, if $f$ is measurable and locally integrable on $(a,b)$, the Lebesgue differentiation theorem implies $F_f$ given by (6.5.1) is a primitive of $f$. This establishes Theorem 6.3.3 and Theorem 6.3.4.

If $F$ is Lipschitz with $|D_cF(x)| \le \lambda$, $a \le c < x \le b$, and $f_n$, $n \ge 1$, are as in (6.5.7), then $|f_n(x)| \le \lambda$ for all $x$ in $(a,b)$, $n \ge 1$. Hence, by the dominated convergence theorem,

$$\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx = \int_a^b F'(x)\,dx.$$

Repeating the argument leading to (6.5.8) yields (6.5.3). This establishes (6.5.3) for $F$ Lipschitz.

The end is now in sight. Given $F$ continuous increasing on $[a,b]$, the plan is to approximate $F$ by $F_\lambda$ Lipschitz.

Suppose $F$ is continuous increasing on $[a,b]$ and $\lambda \ge \lambda_0 = D_aF(b)$, and let $W_\lambda$, $F_\lambda$ be as in Theorem 6.4.3. Then $F(a) = F_\lambda(a)$ and $F(b) = F_\lambda(b)$ and $0 \le D_cF_\lambda(x) \le \lambda$ for $a \le c < x \le b$. For each component interval $(c,d)$ of $W_\lambda$, we have $D_cF_\lambda(d) = D_cF(d) = \lambda$; thus, $\lambda(d - c) = F(d) - F(c)$. Summing over all component intervals of $W_\lambda$,

$$\lambda \operatorname{length}(W_\lambda) = v_F(W_\lambda) \le v_F(a,b). \tag{6.5.9}$$


If $c \in W_\lambda^c$, there is a sequence $x_n \to c$ with $x_n \notin W_\lambda$, $n \ge 1$. Thus given $c \in W_\lambda^c$, there is a sequence $x_n \to c$ with $D_cF_\lambda(x_n) = D_cF(x_n)$, $n \ge 1$. If moreover $F'(c)$ and $F_\lambda'(c)$ both exist, it follows that $F'(c) = F_\lambda'(c)$.

Since $F$ is continuous increasing, $F'(x) \ge 0$ exists for almost all $x$ in $(a,b)$. Since $F_\lambda$ is Lipschitz, $F_\lambda$ is absolutely continuous (Exercise 6.5.5); hence, $F_\lambda'(x)$ exists for almost all $x$ in $(a,b)$. Let $N$ be the negligible set on whose complement both $F'$ and $F_\lambda'$ exist, and let $\{F' \ne F_\lambda'\}$ be the set on whose complement both $F'$ and $F_\lambda'$ exist and are equal. We conclude

$$\{F' \ne F_\lambda'\} \subset W_\lambda \cup N.$$


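But (6.5.3) is valid for $F_\lambda$, and $F_\lambda$ agrees with $F$ at $a$ and $b$; hence,

$$\begin{aligned}
F(b) - F(a) - \int_a^b F'(x)\,dx &= \int_a^b \left(F_\lambda'(x) - F'(x)\right) dx \\
&\le \int_a^b |F_\lambda'(x) - F'(x)|\,dx \\
&\le \int_{\{F' \ne F_\lambda'\}} F'(x)\,dx + \int_{\{F' \ne F_\lambda'\}} \lambda\,dx \\
&\le \int_{W_\lambda} F'(x)\,dx + \lambda \operatorname{length}(W_\lambda).
\end{aligned} \tag{6.5.10}$$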


So far, (6.5.10) is valid for any $\lambda \ge \lambda_0 = D_aF(b)$ and any $F$ continuous increasing on $[a,b]$. But $F$ increasing implies $v_F(a,b) < \infty$; hence, by (6.5.9), $\operatorname{length}(W_\lambda) \to 0$ as $\lambda \to \infty$. Since $F'$ is integrable, the first term in (6.5.10) vanishes as $\lambda \to \infty$.

If $F$ is absolutely continuous, it follows that $v_F(W_\lambda) \to 0$ as $\lambda \to \infty$. By (6.5.9) again, $\lambda \operatorname{length}(W_\lambda) \to 0$ as $\lambda \to \infty$; hence, the second term in (6.5.10) vanishes as $\lambda \to \infty$. This establishes (6.5.3) for $F$ absolutely continuous and increasing on $[a,b]$. By Theorem 6.5.1, this establishes (6.5.3). This completes the proof of Theorem 6.5.2.

Now let $F$ be locally absolutely continuous on $(a,b)$ with $F'(x) = 0$ almost everywhere on $(a,b)$. Applying Theorem 6.5.2 on $[c,d] \subset (a,b)$ yields $F(c) = F(d)$; thus, $F$ is constant. This establishes Theorem 6.3.5.

Let $f$ be continuous on $(a,b)$, and suppose $F$ is locally absolutely continuous on $(a,b)$ with $F'(x) = f(x)$ for almost all $x$. Fix $c$ in $(a,b)$. Then $F$ and $F_c$ are both primitives of $f$ and hence differ by a constant. From Chapter 4, since $f$ is continuous, $F_c'(x) = f(x)$ for all $x$ in $(a,b)$. Hence $F'(x) = f(x)$ for all $x$ in $(a,b)$. Conversely, suppose $F'(x) = f(x)$ for all $x$ in $(a,b)$. Since $f$ is bounded on $[c,d] \subset (a,b)$, the mean value theorem implies that $F$ is Lipschitz on $[c,d]$, hence absolutely continuous on $[c,d]$, and hence locally absolutely continuous on $(a,b)$. This establishes Theorem 6.3.2.


Exercises 


6.5.1. If $F$ is defined on $[a,b]$ and $U$ is a countable disjoint union of open sets $U_n$, $n \ge 1$, in $(a,b)$, then

$$v_F(U) = \sum_{n=1}^\infty v_F(U_n).$$


6.5.2. Show that the definitions of absolute continuity in §6.3 and in this section are equivalent.


6.5.3. If $a = x_0 < x_1 < \cdots < x_n = b$ is a partition of $[a,b]$, $v_F(a,b) = v_F(x_0,x_1) + \cdots + v_F(x_{n-1},x_n)$. Conclude $F$ absolutely continuous on $[a,b]$ implies $v_F(a,b) < \infty$.


6.5.4. With $v_F(x) = v_F(a,x)$, show $v_{v_F}(U) = v_F(U)$.


6.5.5. Show $F$ Lipschitz on $[a,b]$ implies $F$ absolutely continuous on $[a,b]$.


6.5.6. The estimates (6.5.5) and (6.5.6) correspond to two of the four Dini 
derivatives. Write down and prove the estimates corresponding to the other 
two Dini derivatives. 


6.5.7. Let $C$ be the Cantor set, $F$ the Cantor function, and $U_\lambda$ as in (6.5.5). Let $(c_k,d_k)$, $k \ge 1$, be the component intervals of $[0,1] \setminus C$, and let $Z = \{c_k : k \ge 1\}$. Show

$$\bigcap_{\lambda > 0} U_\lambda = C \setminus Z.$$


6.5.8. Let $F$ be continuous bounded variation on $[a,b]$. Then (6.5.8) holds.


6.6 The Lebesgue Differentiation Theorem 


Now we turn to the proof of the first fundamental theorem, Theorem 6.3.1. Without loss of generality, by restricting to a subinterval $[c,d] \subset (a,b)$, we may assume $f$ is integrable on $(a,b)$. Then $F = F_f$ is defined on $[a,b]$.

Assume first (6.3.5) holds, and let $f_n$ be given by (6.5.7). Then $f$ is almost everywhere equal to the pointwise limit of the continuous functions $f_n$; hence, $f$ is measurable.

Conversely, if $f$ is measurable, $F = F_f$ is absolutely continuous on $[a,b]$; hence, $F'$ exists almost everywhere on $(a,b)$. If $g = f - F'$, then $g$ is measurable integrable on $(a,b)$, and by Theorem 6.5.2,

$$G(x) = \int_a^x g(t)\,dt = 0, \qquad a \le x \le b.$$

But this implies

$$\int_U g(x)\,dx = 0$$

for every open interval $U$ in $(a,b)$. Let $U \subset (a,b)$ be open. Applying this to each component interval of $U$ and summing over these intervals yields the same equality but now for any $U \subset (a,b)$ open.

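Given any $A \subset (a,b)$ and $\epsilon > 0$, there is an open $U \supset A$ with $\operatorname{length}(U) \le \operatorname{length}(A) + \epsilon$. Intersecting $U$ with $(a,b)$ if necessary, we may assume $U \subset (a,b)$. Hence there is a sequence of open supersets $U_n \subset (a,b)$ of $A$ with $\operatorname{length}(U_n) \to \operatorname{length}(A)$.

If $A$ is measurable, then

$$\operatorname{length}(U_n \setminus A) = \operatorname{length}(U_n) - \operatorname{length}(A) \to 0.$$

By Theorem 6.5.3, the integral of $g$ over $U_n \setminus A$ goes to 0 as $n \to \infty$. Since by linearity,

$$0 = \int_{U_n} g(x)\,dx = \int_A g(x)\,dx + \int_{U_n \setminus A} g(x)\,dx,$$

it follows that the integral of $g$ over $A$ vanishes; hence,

$$\int_A f(x)\,dx = \int_A F'(x)\,dx.$$

Choosing $A = \{\pm g \ge \epsilon\} = \{x : \pm g(x) \ge \epsilon\}$ yields

$$0 = \int_{\{\pm g \ge \epsilon\}} g^\pm(x)\,dx = \int_{\{\pm g \ge \epsilon\}} \pm g(x)\,dx \ge \epsilon \operatorname{length}(\{\pm g \ge \epsilon\}); \tag{6.6.1}$$

hence, $\{\pm g \ge \epsilon\}$ is negligible for all $\epsilon > 0$, or $g = 0$ almost everywhere on $(a,b)$.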

Above we derived the first fundamental theorem from the second fundamental theorem. One can instead derive the second fundamental theorem from the first fundamental theorem.

Indeed both approaches start by establishing the almost everywhere existence and integrability of $F'$ for $F$ continuous increasing. Then the above calculation uses the absolute continuity of $G$ to derive the first fundamental theorem from the second fundamental theorem. Conversely, by the first fundamental theorem,

$$G(x) = F(x) - \int_a^x F'(t)\,dt, \qquad a \le x \le b,$$

satisfies $G'(x) = 0$ almost everywhere, and (6.5.8) implies $G$ is increasing. Using the absolute continuity of $G$ (Exercise 6.6.7), $G$ is a constant, obtaining the second fundamental theorem.

This latter approach necessitates a direct proof of the first fundamental theorem. This standard proof, which we do not include, is based on the Lebesgue density theorem, approximation by simple functions, and the maximal inequality, Exercise 6.4.2.

Here is a basic estimate. The proof presented is that in [13]. When $F$ is continuous increasing, this is a consequence of the sunrise lemma via (6.5.6).


Theorem 6.6.1 (Fundamental Lemma). Let $F$ be continuous on $\mathbf{R}$, $\lambda > 0$ and $A \subset \mathbf{R}$ arbitrary. If $F'(c)$ exists and satisfies $|F'(c)| < \lambda$ for $c \in A$, then

$$\operatorname{length}(F(A)) \le \lambda \operatorname{length}(A).$$


Given $\epsilon > 0$, select open $U_\epsilon \supset A$ with $\operatorname{length}(U_\epsilon) \le \operatorname{length}(A) + \epsilon$. For each $c \in \mathbf{R}$, let

$$I_c = \{x \in U_\epsilon : |D_cF(t)| < \lambda \text{ for } 0 < |t - c| \le |x - c|\}.$$

If $|F'(c)| < \lambda$, then $I_c$ is the largest nonempty open interval $I_c \subset U_\epsilon$ centered at $c$ on which $|D_cF(x)| < \lambda$ holds. Let

$$U = \bigcup\{I_c : c \in A\}.$$

Then $U$ is open (§2.1) and $A \subset U \subset U_\epsilon$. If $(a,b)$ is a component interval of $U$, then

$$(a,b) = \bigcup\{I_c : c \in A \cap (a,b)\}.$$

Let $[c,d] \subset (a,b)$. Since $\{I_x : x \in A \cap (a,b)\}$ is an open cover of $[c,d]$ and $[c,d]$ is a compact set (Theorem 2.1.5), there is a finite subcover of $[c,d]$; thus, there are $c_1 \le c_2 \le \cdots \le c_N$ in $A \cap (a,b)$ with

$$[c,d] \subset I_1 \cup I_2 \cup \cdots \cup I_N = I,$$

where $I_k = I_{c_k}$, $k = 1, \dots, N$. By discarding intervals, we may assume $c_1 < c_2 < \cdots < c_N$ and

$$I_k \cap I_{k+1} \cap (c_k, c_{k+1}), \qquad 1 \le k < k+1 \le N,$$

are nonempty. We may also assume $I_1, I_2, \dots, I_N$ are all bounded; otherwise, $\operatorname{length}(A) = \infty$ and there is nothing to prove.
For each $k = 1, \dots, N$, define

$$F_k^\pm(x) = F(c_k) \pm \lambda(x - c_k), \qquad x \in \mathbf{R}.$$

The graphs of these functions are the lines in Figure 6.3. Let $M_k$ and $m_k$ denote the sup and inf of $F_k^+$ over $I_k$, $k = 1, \dots, N$. Since $I_k$ is centered at $c_k$, $M_k$ and $m_k$ are also the sup and inf of $F_k^-$ over $I_k$, $k = 1, \dots, N$. Moreover, $F(I_k) \subset [m_k, M_k]$, $k = 1, \dots, N$; hence, $F(I) \subset [m_*, M^*]$, where

$$M^* = \max(M_1, \dots, M_N), \qquad m_* = \min(m_1, \dots, m_N).$$

Let $I^*$, $c^*$, and $F^{*\pm}$ denote the intervals and centers and lines $I_i$, $c_i$, and $F_i^\pm$, where $i = \min\{k : M_k = M^*\}$, and let $I_*$, $c_*$, and $F_*^\pm$ denote the intervals and centers and lines $I_j$, $c_j$, and $F_j^\pm$, where $j = \min\{k : m_k = m_*\}$.


Fig. 6.3 The Fundamental Lemma [13]


Define $G^*$, $G_*$ on $\mathbf{R}$ as follows: If $c^* \ge c_*$, $G^* = F^{*+}$, $G_* = F_*^+$; otherwise, if $c^* < c_*$, $G^* = F^{*-}$, $G_* = F_*^-$. Then

$$G_* - G^* = \lambda |c_* - c^*| + F(c_*) - F(c^*) \ge |c_* - c^*| \left(\lambda - |D_{c_*}F(c^*)|\right).$$

We claim $(m_*, M^*) \subset G_*(I)$. The realization that $F(I)$ is contained in the image $G_*(I)$ of a single line of slope $\pm\lambda$, as suggested by Figure 6.3, is the key idea of this proof.

Now note $|D_aF(b)| \le \lambda$ and $|D_bF(c)| \le \lambda$ together imply $|D_aF(c)| \le \lambda$. Let $x$ be in $I_k \cap I_{k+1}$ with $c_k < x < c_{k+1}$. Then

$$|D_{c_k}F(x)| \le \lambda, \qquad |D_xF(c_{k+1})| \le \lambda$$

imply

$$|D_{c_k}F(c_{k+1})| \le \lambda, \qquad 1 \le k < k+1 \le N.$$

Applying this through all $c_k$ between $c_*$ and $c^*$, we obtain $|D_{c_*}F(c^*)| \le \lambda$. Putting this all together, $G_* \ge G^*$; hence,

$$\sup_I G_* \ge \sup_{I^*} G_* \ge \sup_{I^*} G^* = M^* \ge m_* = \inf_{I_*} G_* \ge \inf_I G_*.$$

This establishes the claim.
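Since $G_*$ is a line with absolute slope $\lambda$, we conclude

$$\operatorname{length}(F([c,d])) \le \operatorname{length}(F(I)) \le \operatorname{length}(G_*(I)) = \lambda \operatorname{length}(I) \le \lambda(b - a).$$

Allowing $[c,d]$ to fill $(a,b)$, we obtain

$$\operatorname{length}(F((a,b))) \le \lambda(b - a).$$

Summing over all component intervals $(a,b)$ of $U$, we arrive at

$$\operatorname{length}(F(A)) \le \sum_{(a,b)} \operatorname{length}(F((a,b))) \le \sum_{(a,b)} \lambda(b - a) = \lambda \operatorname{length}(U) \le \lambda\left(\operatorname{length}(A) + \epsilon\right).$$

Since $\epsilon > 0$ is arbitrary, the result follows.

Note the next result is a direct generalization of the second fundamental theorem of §4.4 and §6.3.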


Theorem 6.6.2. Let $F$ be continuous on $[a,b]$. Then⁹

$$\operatorname{length}(F([a,b])) - \operatorname{length}(F(N)) \le \int_{\{F' \text{ exists}\}} |F'(x)|\,dx \le v_F(a,b),$$

where $N = (a,b) \setminus \{F' \text{ exists}\}$.

⁹ $N$ may not be negligible: For example, $F'$ may exist nowhere!


Let $\{F' = 0\} = \{c \in (a,b) : F'(c) \text{ exists and } F'(c) = 0\}$. By the Fundamental Lemma,

$$\operatorname{length}(F(\{F' = 0\})) \le \lambda(b - a)$$

for all $\lambda > 0$; hence, $F(\{F' = 0\})$ is negligible.

Given $\theta > 1$, $\mathbf{R}^+$ is the disjoint union of intervals $[\theta^n, \theta^{n+1})$, $n \in \mathbf{Z}$; hence, $\{F' \text{ exists}\}$ is the disjoint union of the measurable sets $\{F' = 0\}$ and

$$A_n = \{c \in (a,b) : F'(c) \text{ exists and } \theta^n \le |F'(c)| < \theta^{n+1}\}, \qquad n \in \mathbf{Z}.$$

Apply the fundamental lemma with $A = A_n$ and $\lambda = \theta^{n+1}$, yielding

$$\operatorname{length}(F(A_n)) \le \theta^{n+1} \operatorname{length}(A_n) \le \theta \int_{A_n} |F'(x)|\,dx, \qquad n \in \mathbf{Z}.$$

Summing over $n \in \mathbf{Z}$, we obtain

$$\operatorname{length}(F(\{|F'| > 0\})) \le \sum_{n \in \mathbf{Z}} \theta \int_{A_n} |F'(x)|\,dx = \theta \int_{\{F' \text{ exists}\}} |F'(x)|\,dx.$$

Since $F(\{F' = 0\})$ is negligible and $\theta > 1$ is arbitrary, we conclude

$$\operatorname{length}(F(\{F' \text{ exists}\})) \le \int_{\{F' \text{ exists}\}} |F'(x)|\,dx.$$

Now $F((a,b)) = F(\{F' \text{ exists}\}) \cup F(N)$; hence,

$$\operatorname{length}(F([a,b])) \le \operatorname{length}(F(N)) + \int_{\{F' \text{ exists}\}} |F'(x)|\,dx.$$

By (6.5.8), the result follows.
Let $F$ be continuous on $[a,b]$. We say $F$ has Lusin's property if $F(N)$ is negligible for every negligible $N \subset (a,b)$. By Exercise 6.6.2, absolute continuity implies Lusin's property. Applying the above result to an absolutely continuous increasing function yields an alternate path to Theorem 6.5.2 which avoids the Lipschitz approximation argument used in §6.5.

Here is another consequence of the above result.


Theorem 6.6.3 (Banach-Zaretski). Let $F$ be continuous and bounded variation on $[a,b]$. Then $F$ is absolutely continuous iff $F$ satisfies Lusin's property.


Indeed, if $F$ is bounded variation, then $F'$ exists almost everywhere, $N$ above is negligible, and $F'$ is integrable over $(a,b)$. If $F$ also satisfies Lusin's property, then $F(N)$ is negligible. Hence, for $(c,d) \subset (a,b)$,

$$|F(d) - F(c)| \le \operatorname{length}(F([c,d])) \le \int_c^d |F'(x)|\,dx,$$

from which (6.5.2) with $f = F'$ readily follows. Thus $F$ is absolutely continuous.

Given $F$ on $[a,b]$ and $y$ real, let $\#_F(y)$ be the number of reals $x$ in $(a,b)$ satisfying $F(x) = y$. Then $\#_F(y)$ is zero, a natural, or $\infty$ for each $y$ real. More generally, for $U \subset (a,b)$ open, let $\#_{F,U}(y)$ be the number of reals $x$ in $U$ satisfying $F(x) = y$. This counting function $\#_F$ is the Banach indicatrix. Here is another result of Banach [1].


Theorem 6.6.4. Let $F$ be continuous on $[a,b]$. Then

$$\int_{-\infty}^\infty \#_F(y)\,dy = v_F(a,b).$$

More generally, for $U \subset (a,b)$ open,

$$\int_{-\infty}^\infty \#_{F,U}(y)\,dy = v_F(U).$$


This is valid whether $v_F(a,b)$ or $v_F(U)$ are finite or infinite. Also by definition of subgraph in Chapter 4, the integrals of $0 \le \#_F \le \infty$ and $0 \le \#_{F,U} \le \infty$ are well defined, without having to show $\#_F$, $\#_{F,U}$ are measurable (although they are).
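
For a piecewise linear $F$ the identity in Theorem 6.6.4 reduces to finite arithmetic. The Python sketch below is an illustration with hypothetical names (`total_variation`, `indicatrix_integral`); $F$ is described only by its values at consecutive nodes, assumed here to be integers so the arithmetic is exact. Between two consecutive node values the count $\#_F(y)$ is constant and equals the number of linear pieces whose range covers that band, so the integral is a finite sum.

```python
def total_variation(values):
    """Variation of the piecewise linear F with the given consecutive node values."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def indicatrix_integral(values):
    """Integral of the Banach indicatrix of the same F over the real line."""
    levels = sorted(set(values))
    total = 0
    for lo, hi in zip(levels, levels[1:]):
        # Number of linear pieces whose range contains the whole band (lo, hi).
        crossings = sum(1 for a, b in zip(values, values[1:])
                        if min(a, b) <= lo and hi <= max(a, b))
        total += crossings * (hi - lo)
    return total

# Hypothetical zigzag: node values of F at equally spaced points of [a, b].
zigzag = [0, 3, 1, 4, 2]
print(total_variation(zigzag), indicatrix_integral(zigzag))
```

Both quantities agree for the zigzag above. A flat piece, as in `[0, 2, 2, 1]`, makes $\#_F(y) = \infty$ at a single level, which does not affect the integral.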

Since U is the disjoint union of its component intervals (c,d), and #F,u is 
the sum of #p(c,a) Over all (c,d), the second equality follows from the first. 

To establish the first equality, given n ≥ 1, let a = x_0 < x_1 < ··· < x_n = b
be a partition of (a,b) with I_i = (x_{i−1}, x_i), i = 1,...,n, all of equal length,
and set J_1 = (x_0, x_1), J_i = [x_{i−1}, x_i), i = 2,...,n. Then the intervals J_i,
i = 1,...,n, are disjoint with union (a,b).
Define

\[ \#_n(y) = 1_{F(J_1)}(y) + 1_{F(J_2)}(y) + \dots + 1_{F(J_n)}(y). \]

Then #_n(y) = N iff y is in N of the sets F(J_i), 1 ≤ i ≤ n, which happens iff
there is a partition a < c_1 < ··· < c_N < b satisfying F(c_k) = y, k = 1,...,N.
But this implies #_F(y) ≥ N. Thus #_F(y) ≥ #_n(y), n ≥ 1; hence, #_F(y) ≥
sup_n #_n(y).

Given y real, suppose a < c_1 < ··· < c_N < b is a partition satisfying
F(c_k) = y, k = 1,...,N. If n ≥ 1 is selected with (b − a)2^{−n} less than
the mesh of a < c_1 < ··· < c_N < b, exactly N subintervals J_i intersect
{c_1,...,c_N}; then #_{2^n}(y) ≥ N; hence, sup_n #_{2^n}(y) ≥ N. Taking the sup
over all such N yields sup_n #_{2^n}(y) ≥ #_F(y).

But #_{2^n}(y), n ≥ 1, is increasing; hence, #_{2^n}(y), n ≥ 1, converges to
sup_n #_{2^n}(y) = #_F(y). By Exercise 6.2.3, #_n(y) is measurable for n ≥ 1;
hence, so is #_F(y).

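By Exercise 6.6.1 and Exercise 6.5.3, for n ≥ 1,

\[ \int_{-\infty}^{\infty} \#_n(y)\,dy = \sum_{i=1}^n \operatorname{length}(F(J_i)) = \sum_{i=1}^n \operatorname{length}(F(I_i)) \le \sum_{i=1}^n v_F(I_i) = v_F(a,b). \]

Sending n → ∞ through powers of 2 yields, by the monotone convergence
theorem,

\[ \int_{-\infty}^{\infty} \#_F(y)\,dy \le v_F(a,b). \]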


Conversely, let M_i and m_i denote the max and min of F over [x_{i−1}, x_i],
i = 1,...,n. Then given a finite disjoint union U of open intervals (c_k, d_k),
k = 1,...,N, in (a,b), with c_k ≠ d_j, k,j = 1,...,N, by Exercise 6.6.8, we
can select n ≥ 1 such that

\[ \operatorname{var}(F,U) \le \sum_{i=1}^n (M_i - m_i). \tag{6.6.2} \]

Thus

\[ \operatorname{var}(F,U) \le \sum_{i=1}^n (M_i - m_i) = \sum_{i=1}^n \Bigl(\sup_{I_i} F - \inf_{I_i} F\Bigr) = \sum_{i=1}^n \operatorname{length}(F(I_i)) = \int_{-\infty}^{\infty} \#_n(y)\,dy \le \int_{-\infty}^{\infty} \#_F(y)\,dy. \]


Taking the sup over all such U establishes the result. 
If v_F(a,b) < ∞, the above result shows #_F is integrable over R. This can
be leveraged to yield the following generalization of Exercise 4.4.25. 
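
The identity in Theorem 6.6.4 can be checked numerically on a simple example. The following Python sketch is only an illustration under stated assumptions: the helper names `crossings`, `indicatrix_integral`, and `variation` are hypothetical, the indicatrix is approximated by counting sign changes of F − y on a uniform grid, and F = sin on [0, 2π] is chosen because its total variation is 4.

```python
import math

def crossings(F, a, b, y, n=2000):
    """Approximate #_F(y) by counting sign changes of F - y on a uniform grid."""
    vals = [F(a + (b - a) * i / n) - y for i in range(n + 1)]
    vals = [v for v in vals if v != 0]          # drop exact grid hits
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

def indicatrix_integral(F, a, b, ylo, yhi, m=200):
    """Midpoint Riemann sum of y -> #_F(y) over [ylo, yhi]."""
    h = (yhi - ylo) / m
    return h * sum(crossings(F, a, b, ylo + (j + 0.5) * h) for j in range(m))

def variation(F, a, b, n=2000):
    """Variation of F along a uniform partition of [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(abs(F(v) - F(u)) for u, v in zip(xs, xs[1:]))

lhs = indicatrix_integral(math.sin, 0.0, 2 * math.pi, -1.0, 1.0)
rhs = variation(math.sin, 0.0, 2 * math.pi)
print(lhs, rhs)   # both should be close to 4
```

Outside [−1, 1] the indicatrix of sin vanishes, so restricting the y-integral to that interval loses nothing.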


Theorem 6.6.5. Let F be absolutely continuous on [a,b], and let g be
nonnegative measurable on R. Then

\[ \int_{-\infty}^{\infty} g(y)\,\#_F(y)\,dy = \int_a^b g(F(x))\,|F'(x)|\,dx. \]


To see this, start with m < M and let U = F^{−1}((m,M)). Then

\[ \int_m^M \#_F(y)\,dy = \int_{-\infty}^{\infty} \#_{F,U}(y)\,dy = \int_{F^{-1}(m,M)} |F'(x)|\,dx. \]


Now let U be any open set in R, apply this to each component interval
(m, M) of U, and sum over component intervals. This yields

\[ \int_U \#_F(y)\,dy = \int_{F^{-1}(U)} |F'(x)|\,dx = \int_a^b 1_U(F(x))\,|F'(x)|\,dx. \]


If A ⊂ R is measurable, select a sequence (U_n) of open supersets of A with
length(U_n ∖ A) → 0. Then

\[ \int_{U_n} \#_F(y)\,dy = \int_a^b 1_{U_n}(F(x))\,|F'(x)|\,dx \ge \int_a^b 1_A(F(x))\,|F'(x)|\,dx \]

for n ≥ 1. Since #_F is integrable, sending n → ∞, we obtain

\[ \int_A \#_F(y)\,dy \ge \int_a^b 1_A(F(x))\,|F'(x)|\,dx. \]


Now replace A by U_n ∖ A in this last inequality, to get

\[ \int_{U_n \setminus A} \#_F(y)\,dy \ge \int_a^b 1_{U_n \setminus A}(F(x))\,|F'(x)|\,dx. \]

Since #_F is integrable, the left side goes to zero as n → ∞ and hence so does
the right side. Since

\[ 1_{U_n}(F(x)) - 1_A(F(x)) = 1_{U_n \setminus A}(F(x)), \]

we conclude

\[ \int_A \#_F(y)\,dy = \int_a^b 1_A(F(x))\,|F'(x)|\,dx. \]


By linearity, this implies the result for g simple measurable. By
Theorem 6.1.4, the result follows.
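
As a sanity check of Theorem 6.6.5, both sides can be approximated for a concrete choice. The sketch below is an assumption-laden illustration, not part of the proof: the names `crossings`, `lhs`, and `rhs` are hypothetical, F = sin on [0, 2π], g(y) = y², and the indicatrix is again approximated by counting sign changes on a grid. For this choice both sides equal 4/3.

```python
import math

def crossings(F, a, b, y, n=2000):
    """Approximate #_F(y) by counting sign changes of F - y on a uniform grid."""
    vals = [F(a + (b - a) * i / n) - y for i in range(n + 1)]
    vals = [v for v in vals if v != 0]
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

def lhs(F, g, a, b, ylo, yhi, m=200):
    """Midpoint sum for the integral of g(y) * #_F(y) over [ylo, yhi]."""
    h = (yhi - ylo) / m
    ys = [ylo + (j + 0.5) * h for j in range(m)]
    return h * sum(g(y) * crossings(F, a, b, y) for y in ys)

def rhs(F, dF, g, a, b, n=20000):
    """Midpoint sum for the integral of g(F(x)) * |F'(x)| over [a, b]."""
    h = (b - a) / n
    xs = [a + (i + 0.5) * h for i in range(n)]
    return h * sum(g(F(x)) * abs(dF(x)) for x in xs)

square = lambda y: y * y
left = lhs(math.sin, square, 0.0, 2 * math.pi, -1.0, 1.0)
right = rhs(math.sin, math.cos, square, 0.0, 2 * math.pi)
print(left, right)   # both should be close to 4/3
```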


Exercises 


6.6.1. If F is continuous on [a,b] and U ⊂ (a,b) is open, then
length(F(U)) ≤ v_F(U).
When F is continuous increasing, this is an equality.


6.6.2. If F is locally absolutely continuous on R, then F has Lusin’s property.
(This result is false for the Cantor function as F(C) = [0, 1].) 


6.6.3. Let F be continuous bounded variation on [a,b]. Use (6.5.8) to obtain

\[ |F'(x)| \le \frac{d}{dx} v_F(a,x), \qquad \text{for almost all } x \in (a,b). \]


6.6.4. Use (6.5.2) and (6.5.3) to derive

\[ \frac{d}{dx} v_F(a,x) = |F'(x)|, \qquad \text{for almost all } x \in (a,b), \]

for F absolutely continuous on [a,b]. Conclude (6.5.4).


6.6.5. Let F be absolutely continuous on [a,b] and A_n ⊂ (a,b) arbitrary,
n ≥ 1. Then length(A_n) → 0 implies length(F(A_n)) → 0.


6.6.6. Let F be absolutely continuous on [a,b]. Then M ⊂ (a,b) measurable
implies F(M) measurable. (Since the Cantor set C is negligible, every subset
of C is measurable. But for the Cantor function F(C) = [0,1], so either 
this conclusion is false for the Cantor function, or every subset of [0,1] is 
measurable.) 


6.6.7. Let F be absolutely continuous increasing on [a,b] with F′ = 0 almost
everywhere. Use (6.5.6) and Exercises 6.6.1 and 6.6.2 to directly show F is
a constant.


6.6.8. With notation as in (6.6.2), if c ∈ J_i and d ∈ J_j, i ≤ j, then

\[ |F(d) - F(c)| \le \sum_{i \le \ell \le j} (M_\ell - m_\ell). \]

If U is a finite disjoint union of intervals (c_k, d_k), k = 1,...,N, in (a,b),
with the mesh of a = x_0 < x_1 < ··· < x_n = b less than min{|c_j − d_k| : j,
k = 1,...,N}, then (6.6.2) holds.


Appendix A 
Solutions 


A.1 Solutions to Chapter 1 


Solutions to Exercises 1.1 


1.1.1 Suppose that f : X → Y is invertible. This means there is a g : Y → X
with g(f(x)) = x for all x and f(g(y)) = y for all y ∈ Y. If f(x) = f(x′), then
x = g(f(x)) = g(f(x′)) = x′. Hence f is injective. If y ∈ Y, then x = g(y)
satisfies f(x) = y. Hence f is surjective. We conclude that f is bijective.
Conversely, if f is bijective, for each y ∈ Y, let g(y) denote the unique
element x ∈ X satisfying f(x) = y. Then by construction, f(g(y)) = y.
Moreover, since x is the unique element of X mapping to f(x), we also have
x = g(f(x)). Thus g is the inverse of f. Hence f is invertible.


1.1.2 Suppose that g_1 : Y → X and g_2 : Y → X are inverses of f. Then
f(g_1(y)) = y = f(g_2(y)) for all y ∈ Y. Since f is injective, this implies
g_1(y) = g_2(y) for all y ∈ Y, i.e., g_1 = g_2.


1.1.3 For the first equation in De Morgan’s law, choose a in the right side.
Then a lies in every set A^c, A ∈ F. Hence a does not lie in any of the sets
A, A ∈ F. Hence a is not in ⋃{A : A ∈ F}, i.e., a is in the left side. If a is
in the left side, then a is not in ⋃{A : A ∈ F}. Hence a is not in A for all
A ∈ F. Hence a is in A^c for all A ∈ F. Thus a is in ⋂{A^c : A ∈ F}. This
establishes the first equation. To obtain the second equation in De Morgan’s
law, replace F in the first equation by F′ = {A^c : A ∈ F}.


1.1.4 ⋃{x} = {t : t ∈ x} = x. On the other hand, {⋃x} is a singleton, so it
can equal x only when x is a singleton.


1.1.5 If x ∈ b, there are two cases, x ∈ a and x ∉ a. In the first case, x ∈ a ∩ b,
while in the second, x ∈ a ∪ b but not in a, so x ∈ (a ∪ b) ∖ a.


1.1.6 If x ∈ A ∈ F, then x ∈ ⋃F by definition, so A ⊂ ⋃F. If x ∈ ⋂F, then
x ∈ A ∈ F; hence ⋂F ⊂ A.


1.1.7 The elements of (a,b) are {a} and {a,b}, so their union is ⋃(a,b) =
{a,b}, and their intersection is ⋂(a,b) = {a}. Hence ⋃⋃(a,b) = a ∪ b,
⋂⋃(a,b) = a ∩ b, ⋃⋂(a,b) = a, and ⋂⋂(a,b) = a. By Exercise 1.1.5, b is
computable from a, a ∪ b, and a ∩ b.


1.1.8 If a ∈ S(x) = x ∪ {x}, then either a ∈ x or a = x. In the first case,
a ⊂ x ⊂ S(x), since x is hierarchical. In the second, a = x ⊂ S(x). Thus S(x) is
hierarchical.


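1.1.9 If a = c and b = d, then

\[ (a,b) = \{\{a\},\{a,b\}\} = \{\{c\},\{c,d\}\} = (c,d). \]

Conversely, suppose (a,b) = (c,d). By Exercise 1.1.7,

\[ a = \bigcup\bigcap(a,b) = \bigcup\bigcap(c,d) = c \]

and

\[ a \cup b = \bigcup\bigcup(a,b) = \bigcup\bigcup(c,d) = c \cup d \]

and

\[ a \cap b = \bigcap\bigcup(a,b) = \bigcap\bigcup(c,d) = c \cap d. \]

By Exercise 1.1.5,

\[ b = ((a \cup b) \setminus a) \cup (a \cap b) = ((c \cup d) \setminus c) \cup (c \cap d) = d. \]

Thus a = c and b = d.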


Solutions to Exercises 1.2 


1.2.1 a0 = a0 + (a − a) = (a0 + a) − a = (a0 + a1) − a = a(0 + 1) − a =
a1 − a = a − a = 0. This is not the only way.


1.2.2 The number 1 satisfies 1a = a1 = a for all a. If 1′ also satisfied 1′b =
b1′ = b for all b, then choosing a = 1′ and b = 1 yields 1 = 11′ = 1′. Hence
1 = 1′. Now, suppose that a has two negatives, one called −a and one called b.
Then a + b = 0, so −a = −a + 0 = −a + (a + b) = (−a + a) + b = 0 + b = b. Hence
b = −a, so a has a unique negative. If a ≠ 0 has two reciprocals, one called 1/a
and one called b, ab = 1, so 1/a = (1/a)1 = (1/a)(ab) = [(1/a)a]b = 1b = b.


1.2.3 Since a + (−a) = 0, a is the (unique) negative of −a which means
−(−a) = a. Also a + (−1)a = 1a + (−1)a = [1 + (−1)]a = 0a = 0, so (−1)a
is the negative of a or −a = (−1)a.


1.2.4 By the ordering property, a > 0 and b > 0 implies ab > 0. If a < 0
and b > 0, then −a > 0. Hence (−a)b > 0, so ab = [−(−a)]b = (−1)(−a)b =
−((−a)b) < 0. Thus negative times positive is negative. If a < 0 and b < 0,
then −b > 0. Hence a(−b) < 0. Hence ab = a(−(−b)) = a(−1)(−b) =
(−1)a(−b) = −(a(−b)) > 0. Thus negative times negative is positive. Also
1 = 11 > 0 whether 1 > 0 or 1 < 0 (1 ≠ 0 is part of the ordering property).


1.2.5 a < b implies b − a > 0 which implies (b + c) − (a + c) > 0 or b + c > a + c.
Also c > 0 implies c(b − a) > 0 or bc − ac > 0 or bc > ac. If a < b and b < c,
then b − a and c − b are positive. Hence c − a = (c − b) + (b − a) is positive or
c > a. Multiplying a < b by a > 0 and by b > 0 yields aa < ab and ab < bb.
Hence aa < bb.


1.2.6 If 0 < a < b, we know, from above, that aa < bb. Conversely, if 
aa < bb, then we cannot have a > b because applying 1.2.5 with the roles of 
a, b reversed yields aa > bb, a contradiction. Hence a < b iff aa < bb.


1.2.7 


A. By definition, inf A ≤ x for all x ∈ A. Hence −inf A ≥ −x for all x ∈ A.
Hence −inf A ≥ y for all y ∈ −A. Hence −inf A ≥ sup(−A). Conversely,
sup(−A) ≥ y for all y ∈ −A, or sup(−A) ≥ −x for all x ∈ A, or
−sup(−A) ≤ x for all x ∈ A. Hence −sup(−A) ≤ inf A which implies
sup(−A) ≥ −inf A. Since we already know that sup(−A) ≤ −inf A, we
conclude that sup(−A) = −inf A. Now, replace A by −A in the last
equation. Since −(−A) = A, we obtain sup A = −inf(−A) or inf(−A) =
−sup A.

B. Now, sup A ≥ x for all x ∈ A, so (sup A) + a ≥ x + a for all x ∈ A. Hence
(sup A) + a ≥ y for y ∈ A + a, so (sup A) + a ≥ sup(A + a). In this last
inequality, replace a by −a and A by A + a to obtain [sup(A + a)] − a ≥
sup(A + a − a), or sup(A + a) ≥ (sup A) + a. Combining the two inequalities
yields sup(A + a) = (sup A) + a. Replacing A and a by −A and −a yields
inf(A + a) = (inf A) + a.

C. Now, sup A ≥ x for all x ∈ A. Since c > 0, c sup A ≥ cx for all
x ∈ A. Hence c sup A ≥ y for all y ∈ cA. Hence c sup A ≥ sup(cA).
Now, in this last inequality, replace c by 1/c and A by cA. We obtain
(1/c) sup(cA) ≥ sup A or sup(cA) ≥ c sup A. Combining the two inequalities
yields sup(cA) = c sup A. Replacing A by −A in this last equation
yields inf(cA) = c inf A.


Solutions to Exercises 1.3 


1.3.1 Let S be the set of naturals n for which there are no naturals between
n and n + 1. From the text, we know that 1 ∈ S. Assume n ∈ S. Then we
claim n + 1 ∈ S. Indeed, suppose that m ∈ N satisfies n + 1 < m < n + 2.
Then m ≠ 1, so m − 1 ∈ N (see §1.3) satisfies n < m − 1 < n + 1, contradicting
n ∈ S. Hence n + 1 ∈ S, so S is inductive. Since S ⊂ N, we conclude that
S = N.


1.3.2 Fix n ∈ N, and let S = {x ∈ R : nx ∈ N}. Then 1 ∈ S since n1 = n.
If x ∈ S, then nx ∈ N, so n(x + 1) = nx + n ∈ N (since N + N ⊂ N), so
x + 1 ∈ S. Hence S is inductive. We conclude that S ⊃ N or nm ∈ N for all
m ∈ N.


1.3.3 Let S be the set of all naturals n such that the following holds: If
m > n and m ∈ N, then m − n ∈ N. From §1.3, we know that 1 ∈ S. Assume
n ∈ S. We claim n + 1 ∈ S. Indeed, suppose that m > n + 1 and m ∈ N. Then
m − 1 > n. Since n ∈ S, we conclude that (m − 1) − n ∈ N or m − (n + 1) ∈ N.
Hence by definition of S, n + 1 ∈ S. Thus S is inductive, so S = N. Thus
m > n implies m − n ∈ N. Since, for m, n ∈ N, m > n, m = n, or m < n,
we conclude that m − n ∈ Z, whenever m, n ∈ N. If n ∈ −N and m ∈ N,
then −n ∈ N and m − n = m + (−n) ∈ N ⊂ Z. If m ∈ −N and n ∈ N,
then −m ∈ N and m − n = −(n + (−m)) ∈ −N ⊂ Z. If n and m are both
in −N, then −m and −n are in N. Hence m − n = −((−m) − (−n)) ∈ Z. If
either m or n equals zero, then m − n ∈ Z. This shows that Z is closed under
subtraction.


1.3.4 If n is even and odd, then n + 1 is even. Hence 1 = (n + 1) − n is even,
say 1 = 2k with k ∈ N. But k ≥ 1 implies 1 = 2k ≥ 2, which contradicts
1 < 2. If n = 2k is even and m ∈ N, then nm = 2(km) is even. If n = 2k − 1
and m = 2j − 1 are odd, then nm = 2(2kj − k − j + 1) − 1 is odd.


1.3.5 For all n ≥ 1, we establish the claim: If m ∈ N and there is a bijection
between {1,...,n} and {1,...,m}, then n = m. For n = 1, the claim is
clearly true. Now, assume the claim is true for a particular n, and suppose
that we have a bijection f between {1,...,n + 1} and {1,...,m} for some
m ∈ N. Then by restricting f to {1,...,n}, we obtain a bijection g between
{1,...,n} and {1,...,k − 1,k + 1,...,m}, where k = f(n + 1). Now, define
h(i) = i if 1 ≤ i ≤ k − 1 and h(i) = i − 1 if k + 1 ≤ i ≤ m. Then h is a bijection
between {1,...,k − 1,k + 1,...,m} and {1,...,m − 1}. Hence h ∘ g is a bijection
between {1,...,n} and {1,...,m − 1}. By the inductive hypothesis, this forces
m − 1 = n or m = n + 1. Hence the claim is true for n + 1. Hence the claim is
true, by induction, for all n ≥ 1. Now, suppose that A is a set with n elements
and with m elements. Then there are bijections f : A → {1,...,n} and
g : A → {1,...,m}. Hence g ∘ f^{−1} is a bijection from {1,...,n} to {1,...,m}.
Hence m = n. This shows that the number of elements of a set is well defined.
For the last part, suppose that A and B have n and m elements, respectively,
and are disjoint. Let f : A → {1,...,n} and g : B → {1,...,m} be bijections,
and let h(i) = i + n, 1 ≤ i ≤ m. Then h ∘ g : B → {n + 1,...,n + m} is a
bijection. Now, define k : A ∪ B → {1,...,n + m} by setting k(x) = f(x) if
x ∈ A and k(x) = h ∘ g(x) if x ∈ B. Then k is a bijection, establishing the
number of elements of A ∪ B is n + m.


1.3.6 Let A ⊂ R be finite. By induction on the number of elements, we show
that max A exists. If A = {a}, then a = max A. So max A exists. Now, assume
that every subset with n elements has a max. If A is a set with n + 1 elements
and a ∈ A, then B = A ∖ {a} has n elements. Hence max B exists. There
are two cases: If a ≤ max B, then max B = max A. Hence max A exists. If
a > max B, then a = max A. Hence max A exists. Since in either case max A
exists when #A = n + 1, by induction, max A exists for all finite subsets A.
Since −A is finite whenever A is finite, min A exists by the reflection property.


1.3.7 Let c = sup S. Since c − 1 is not an upper bound, choose n ∈ S with
c − 1 < n ≤ c. If c ∈ S, then c = max S and we are done. Otherwise, c ∉ S,
and c − 1 < n < c. Now, choose m ∈ S with c − 1 < n < m < c concluding
that m − n = (m − c) − (n − c) lies between 0 and 1, a contradiction. Thus
c = max S.


1.3.8 If we write x = n/d with n ∈ N and d ∈ N, then dx = n ∈ N, so S
is nonempty. Conversely, if dx = n ∈ N, then x = n/d ∈ Q. Since S ⊂ N,
d = min S exists. Also if q ∈ N divides both n and d, then x = (n/q)/(d/q), so
d/q ∈ S. Since d = min S, this implies q = 1; hence n and d have no common
factor. Conversely, if x = n/d with n and d having no common factor, then
reversing the argument shows that d = min S. Now q | d and q | n + kd imply
q | n and q | d; hence q = 1. Thus x + k = (n + kd)/d with n + kd and d having no
common factor; hence d = min{j ∈ N : j(x + k) ∈ Z} or d = d(x + k).


1.3.9 If yq ≤ x, then q ≤ x/y. So S = {q ∈ N : yq ≤ x} is nonempty and bounded
above; hence it has a max, call it q. Let r = x − yq. Then r = 0, or r ∈ R⁺. If
r ≥ y, then x − y(q + 1) = r − y ≥ 0, so q + 1 ∈ S, contradicting the definition
of q. Hence 0 ≤ r < y.


1.3.10 1 ∈ A since (1,a) ∈ f, and n ∈ A implies (n,x) ∈ f for some x implies
(n + 1, g(n,x)) ∈ f implies n + 1 ∈ A. Hence A is an inductive subset of N.
We conclude that A = N. To show that f is a mapping, we show B is empty.
If B were nonempty, by Theorem 1.3.2, B would have a smallest element, call
it b. We show this leads to a contradiction by using b to construct a smaller
inductive set f̃. Either b = 1 or b > 1. If b = 1, let f̃ = f ∖ {(1,z)}, where
z ≠ a. If b > 1, let f̃ = f ∖ {(b,z)}, where z ≠ g(b − 1,c) and (b − 1,c) ∈ f.
We show f̃ is inductive. Let (n,x) be any pair in f̃ ⊂ f; then, since f is
inductive, we know (n + 1, g(n,x)) ∈ f. We must establish (n + 1, g(n,x)) lies
in the smaller set f̃. If b = 1, since n + 1 never equals 1, (n + 1, g(n,x)) ∈ f̃.
If b > 1 and n + 1 ≠ b, then (n + 1, g(n,x)) ∈ f̃, since f and f̃ differ only by
{(b,z)}. If b > 1 and n + 1 = b, then (b − 1,c) ∈ f and (b − 1,x) ∈ f; since
b − 1 ∉ B, it follows that c = x. Since (b,z) ≠ (n + 1, g(n,x)), we conclude
(n + 1, g(n,x)) ∈ f̃. This shows f̃ is inductive. Since f̃ is smaller than f, this
is a contradiction; hence B is empty and f is a mapping satisfying f(1) = a,
and f(n + 1) = g(n, f(n)), n ≥ 1. If h : N → R also satisfied h(1) = a and
h(n + 1) = g(n, h(n)), n ≥ 1, then C = {n ∈ N : f(n) = h(n)} is inductive,
implying f = h.


1.3.11 By construction, we know that a^{n+1} = a^n a for all n ≥ 1. Let S be
the set of m ∈ N, such that a^{n+m} = a^n a^m for all n ∈ N. Then 1 ∈ S. If
m ∈ S, then a^{n+m} = a^n a^m, so a^{n+(m+1)} = a^{(n+m)+1} = a^{n+m} a = a^n a^m a =
a^n a^{m+1}, so m + 1 ∈ S. Thus S is inductive. Hence S = N. This shows
that a^{n+m} = a^n a^m for all n, m ∈ N. If n = 0, then a^{n+m} = a^n a^m is
clear, whereas n < 0 implies a^{n+m} = a^{n+m} a^{−n} a^n = a^{n+m−n} a^n = a^m a^n.
This shows that a^{n+m} = a^n a^m for n ∈ Z and m ∈ N. Repeating this last
argument with m, instead of n, we obtain a^{n+m} = a^n a^m for all n, m ∈ Z.
We also establish the second part by induction on m with n ∈ Z fixed: If
m = 1, the equation (a^n)^m = a^{nm} is clear. So assume it is true for m. Then
(a^n)^{m+1} = (a^n)^m (a^n)^1 = a^{nm} a^n = a^{nm+n} = a^{n(m+1)}. Hence it is true for
m + 1, hence by induction, for all m ≥ 1. For m = 0, it is clearly true, whereas
for m < 0, (a^n)^m = 1/(a^n)^{−m} = 1/a^{−nm} = a^{nm}.


1.3.12 By 1 + 2 + ··· + n, we mean the function f : N → R satisfying f(1) = 1
and f(n + 1) = f(n) + (n + 1) (Exercise 1.3.10). Let h(n) = n(n + 1)/2,
n ≥ 1. Then h(1) = 1, and

\[ h(n) + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{(n+1)(n+2)}{2} = h(n+1). \]

Thus by uniqueness, f(n) = h(n) for all n ≥ 1.
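
The uniqueness argument can be mirrored by a small computation. This Python sketch (the function names `f` and `h` follow the solution above; the loop-based definition of `f` is one assumed way to realize the recursion) compares the recursively defined sum with the closed form.

```python
def f(n):
    """1 + 2 + ... + n via the recursion f(1) = 1, f(n + 1) = f(n) + (n + 1)."""
    total = 1
    for k in range(2, n + 1):
        total += k
    return total

def h(n):
    """Closed form n(n + 1)/2; the product n(n + 1) is always even."""
    return n * (n + 1) // 2

print(all(f(n) == h(n) for n in range(1, 200)))
```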


1.3.13 Since p ≥ 2, n < 2^n ≤ p^n for all n ≥ 1. If p^k m = p^j q
with k < j, then m = p^{j−k} q = p p^{j−k−1} q is divisible by p. On the other hand,
if k > j, then q is divisible by p. Hence p^k m = p^j q with m, q not divisible by
p implies k = j. This establishes the uniqueness of the number of factors k.
For existence, if n is not divisible by p, we take k = 0 and m = n. If n is
divisible by p, then n_1 = n/p is a natural < p^{n−1}. If n_1 is not divisible by
p, we take k = 1 and m = n_1. If n_1 is divisible by p, then n_2 = n_1/p is a
natural < p^{n−2}. If n_2 is not divisible by p, we take k = 2 and m = n_2. If n_2
is divisible by p, we continue this procedure by dividing n_2 by p. Continuing
in this manner, we obtain n_1, n_2, ... naturals with n_j < p^{n−j}. Since this
procedure ends in n steps at most, there is some k natural or 0 for which
m = n/p^k is not divisible by p and n/p^{k−1} is divisible by p.
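
The existence half of this argument is an algorithm: keep dividing by p until the quotient is no longer divisible by p. A minimal Python sketch, assuming n ≥ 1 and p ≥ 2 are integers (the name `split_power` is hypothetical):

```python
def split_power(n, p):
    """Return (k, m) with n == p**k * m and m not divisible by p."""
    k, m = 0, n
    while m % p == 0:      # m is still divisible by p: peel off one factor
        m //= p
        k += 1
    return k, m

k, m = split_power(360, 2)
print(k, m)   # 360 = 2**3 * 45
```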


1.3.14 We want to show that S ⊃ N. If this is not so, then N ∖ S would
be nonempty and, hence, would have a least element n. Thus k ∈ S for
all naturals k < n. By the given property of S, we conclude that n ∈ S,
contradicting n ∉ S. Hence N ∖ S is empty or S ⊃ N.


1.3.15 Since m ≥ p, m = pq + r with q ∈ N, r ∈ N ∪ {0}, and 0 ≤ r < p
(Exercise 1.3.9). If r ≠ 0, multiplying by a yields ra = ma − q(pa) ∈ N since
ma ∈ N and pa ∈ N. Hence r ∈ S_a is less than p contradicting p = min S_a.
Thus r = 0, or p divides m.


1.3.16 With a = n/p, p ∈ S_a since pa = n ∈ N, and m ∈ S_a since ma =
nm/p ∈ N. By Exercise 1.3.15, min S_a divides p. Since p is prime, min S_a =
1, or min S_a = p. In the first case, 1·a = a = n/p ∈ N, i.e., p divides n,
whereas in the second case, p divides m.


1.3.17 We use induction according to Exercise 1.3.2. For n = 1, the statement
is true. Suppose that the statement is true for all naturals less than n.
Then either n is a prime or n is composite, n = jk with j > 1 and k > 1. By
the inductive hypothesis, j and k are products of primes, hence so is n. Hence
in either case, n is a product of primes. To show that the decomposition for
n is unique except for the order, suppose that n = p_1 ··· p_r = q_1 ··· q_s. By
the previous exercise, since p_1 divides the left side, p_1 divides the right side,
hence p_1 divides one of the q_j’s. Since the q_j’s are prime, we conclude that p_1
equals q_j for some j. Hence n′ = n/p_1 < n can be expressed as the product
p_2 ··· p_r and the product q_1 ··· q_{j−1} q_{j+1} ··· q_s. By the inductive hypothesis,
these p’s and q’s must be identical except for the order. Hence the result is
true for n. By induction, the result is true for all naturals.


1.3.18 If the algorithm ends, then r_n = 0 for some n. Solving backward,
we see that r_{n−1} ∈ Q, r_{n−2} ∈ Q, etc. Hence x = r_0 ∈ Q. Conversely, if
x ∈ Q, then all the remainders r_n are rationals. Now (see Exercise 1.3.8)
write r = N(r)/D(r). Then D(r) = D(r + n) for n ∈ Z. Moreover, since
0 ≤ r_n < 1, N(r_n) < D(r_n). Then

\[ N(r_n) = D\Bigl(\frac{1}{r_n}\Bigr) = D(a_{n+1} + r_{n+1}) = D(r_{n+1}) > N(r_{n+1}). \]

As long as r_n > 0, the sequence (r_0, r_1, r_2, ...) continues. But this cannot
continue forever, since N(r_0) > N(r_1) > N(r_2) > ... is strictly decreasing.
Hence r_n = 0 for some n.
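
The termination argument can be watched in action with exact rational arithmetic. The following Python sketch assumes the algorithm is the usual one, 1/r_n = a_{n+1} + r_{n+1} with a_{n+1} the integer part of 1/r_n; the name `continued_fraction` and the returned list of numerators are illustrative choices.

```python
from fractions import Fraction

def continued_fraction(x):
    """Digits [a_1, a_2, ...] of a rational 0 < x < 1 and the numerators N(r_n)."""
    r = Fraction(x)                              # r_0 = x
    digits, numerators = [], [r.numerator]
    while r != 0:
        inv = 1 / r
        a = inv.numerator // inv.denominator     # integer part of 1/r_n
        r = inv - a                              # r_{n+1}
        digits.append(a)
        numerators.append(r.numerator)
    return digits, numerators

digits, numerators = continued_fraction(Fraction(7, 16))
print(digits, numerators)   # 7/16 = 1/(2 + 1/(3 + 1/2))
```

The numerators form a strictly decreasing list of nonnegative integers, which is exactly why the loop must stop.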


1.3.19 Let B be the set of n ∈ N satisfying A_n is bounded above. Since
A_1 = f^{−1}({1}) is a single natural, we have 1 ∈ B. Assume n ∈ B; then A_n
is bounded above, say by M. Since A_{n+1} = A_n ∪ f^{−1}({n + 1}), A_{n+1} is then
bounded above by the greater of M and f^{−1}({n + 1}). Hence n + 1 ∈ B.
Thus B is inductive; hence B = N.


1.3.20 Since 2^{1−1} ≤ 1!, the statement is true for n = 1. Assume it is true for
n. Then 2^{(n+1)−1} = 2·2^{n−1} ≤ 2·n! ≤ (n + 1)n! = (n + 1)!.


1.3.21 Since n! is the product of 1, 2, ..., n, and n^n is the product of n, n,
..., n, it is obvious that n! ≤ n^n. To prove it, use induction: 1! = 1^1, and
assuming n! ≤ n^n, we have (n + 1)! = n!(n + 1) ≤ n^n(n + 1) ≤ (n + 1)^n(n + 1) =
(n + 1)^{n+1}. This establishes n! ≤ n^n by induction. For the second part, write
n! = 1·2·····n = n·(n − 1)·····1, so (n!)² = n·1·(n − 1)·2·····1·n or
(n!)² equals the product of k(n + 1 − k) over all 1 ≤ k ≤ n. Since

\[ k(n+1-k) - n = (k-1)(n-k) \ge 0, \qquad 1 \le k \le n, \]

the minimum of k(n + 1 − k) over all 1 ≤ k ≤ n equals n. Thus (n!)² ≥
n·n·····n = n^n. More explicitly, what we are doing is using

\[ \Bigl(\prod_{k=1}^n a(k)\Bigr)\Bigl(\prod_{k=1}^n b(k)\Bigr) = \prod_{k=1}^n a(k)b(k), \]

which is established by induction for a : N → R and b : N → R, then
inserting a(k) = k and b(k) = n + 1 − k, 1 ≤ k ≤ n.
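
The chain of inequalities from the last two solutions, 2^{n−1} ≤ n! ≤ n^n ≤ (n!)², together with the pairing identity, can be spot-checked with exact integer arithmetic. A small Python sketch (the function name `check` is hypothetical):

```python
from math import factorial, prod

def check(n):
    """Verify 2^(n-1) <= n! <= n^n <= (n!)^2 and the pairing used for (n!)^2."""
    f = factorial(n)
    pairs = [k * (n + 1 - k) for k in range(1, n + 1)]
    return (2 ** (n - 1) <= f <= n ** n <= f * f
            and prod(pairs) == f * f      # (n!)^2 is the product of k(n + 1 - k)
            and min(pairs) == n)          # each factor is at least n

print(all(check(n) for n in range(1, 60)))
```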


1.3.22 The inequality (1 + a)^n ≤ 1 + (2^n − 1)a is clearly true for n = 1. So
assume it is true for n. Then (1 + a)^{n+1} = (1 + a)^n(1 + a) ≤ (1 + (2^n − 1)a)(1 +
a) = 1 + 2^n a + (2^n − 1)a² ≤ 1 + (2^{n+1} − 1)a since 0 ≤ a² ≤ a. Hence it is
true for all n. The inequality (1 + b)^n ≥ 1 + nb is true for n = 1. So suppose
that it is true for n. Since 1 + b ≥ 0, then (1 + b)^{n+1} = (1 + b)^n(1 + b) ≥
(1 + nb)(1 + b) = 1 + (n + 1)b + nb² ≥ 1 + (n + 1)b. Hence the inequality is
true for all n.
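
Both inequalities can be tested on a grid of rationals with exact arithmetic. The Python sketch below assumes the hypotheses 0 ≤ a ≤ 1 and b ≥ −1 that the induction steps use; the helper names are hypothetical.

```python
from fractions import Fraction

def upper_ok(a, n):
    """(1 + a)^n <= 1 + (2^n - 1) a, checked for 0 <= a <= 1."""
    return (1 + a) ** n <= 1 + (2 ** n - 1) * a

def lower_ok(b, n):
    """(1 + b)^n >= 1 + n b, checked for b >= -1."""
    return (1 + b) ** n >= 1 + n * b

a_values = [Fraction(i, 10) for i in range(0, 11)]      # 0, 1/10, ..., 1
b_values = [Fraction(i, 4) for i in range(-4, 21)]      # -1, -3/4, ..., 5
print(all(upper_ok(a, n) for a in a_values for n in range(1, 13)))
print(all(lower_ok(b, n) for b in b_values for n in range(1, 13)))
```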


1.3.23 Start with n = 1. Since ⌊1/x⌋ = a_1 = c_1 = ⌊1/z⌋ and ⌊1/x⌋ ≥
⌊1/y⌋ ≥ ⌊1/z⌋, we conclude a_1 = b_1 = c_1. Now assume the result is true for
n − 1, for all 0 < x < y < z < 1. Let x′ be given by x = 1/(a_1 + x′) and
similarly y′, z′. Then apply the n − 1 case to x′ = [a_2, a_3, ...], y′ = [b_2, b_3, ...],
z′ = [c_2, c_3, ...]. We conclude a_j = b_j = c_j, j = 2,...,n.


1.3.24 If #X = 1, then X = {x} with x nonempty, so there is a ∈ x = ⋃X.
In this case we define f(x) = a, establishing the case #X = 1. Assume the
result is true when #X = n. If #X = n + 1, we may write X = X′ ∪ {a} with a
nonempty and #X′ = n. By the inductive hypothesis, there is f : X′ → ⋃X′
with f(x) ∈ x for x ∈ X′. Since a is nonempty, there is b ∈ a. Now extend f
to all of X by defining f(a) = b ∈ a. Since ⋃X = ⋃X′ ∪ a, it follows that
f : X → ⋃X. This establishes the case n + 1, and hence the result follows by
induction.


Solutions to Exercises 1.4 


1.4.1 Since |x| = max(x,−x), x ≤ |x|. Also −a < x < a is equivalent to
−x < a and x < a and hence to |x| = max(x,−x) < a. By definition of
intersection, x > −a and x < a are equivalent to x ∈ {x : x < a} ∩ {x : x > −a}.
Similarly, |x| > a is equivalent to x > a or −x > a, i.e., x lies in the union of
{x : x > a} and {x : x < −a}.


1.4.2 Clearly |0| = 0. If x ≠ 0, then x > 0 or −x > 0; hence |x| =
max(x,−x) > 0. If x > 0 and y > 0, then |xy| = xy = |x||y|. If x > 0
and y < 0, then xy is negative, so |xy| = −(xy) = x(−y) = |x||y|, similarly
for the other two cases.


1.4.3 If n = 1, the inequality is true. Assume it is true for n. Then

\[ |a_1 + \dots + a_n + a_{n+1}| \le |a_1 + \dots + a_n| + |a_{n+1}| \le |a_1| + \dots + |a_n| + |a_{n+1}| \]

by the triangle inequality and the inductive hypothesis. Hence it is true for
n + 1. By induction, it is true for all naturals n.


1.4.4 Assume first that a > 1. Let S = {x : x ≥ 1 and x² < a}. Since 1 ∈ S,
S is nonempty. Also x ∈ S implies x = x1 ≤ x² < a, so S is bounded above.
Let s = sup S. We claim that s² = a. Indeed, if s² < a, note that

\[ \Bigl(s + \frac{1}{n}\Bigr)^2 = s^2 + \frac{2s}{n} + \frac{1}{n^2} \le s^2 + \frac{2s}{n} + \frac{1}{n} = s^2 + \frac{2s+1}{n} < a \]

if (2s + 1)/n < a − s², i.e., if n > (2s + 1)/(a − s²). Since s² < a, b =
(2s + 1)/(a − s²) is a perfectly well defined, positive real. Since sup N = ∞,
such a natural n > b can always be found. This rules out s² < a. If s² > a,
then b = (s² − a)/2s is positive. Hence there is a natural n satisfying 1/n < b
which implies s² − 2s/n > a. Hence

\[ \Bigl(s - \frac{1}{n}\Bigr)^2 = s^2 - \frac{2s}{n} + \frac{1}{n^2} > s^2 - \frac{2s}{n} > a, \]

so (by Exercise 1.2.6) s − 1/n is an upper bound for S. This shows that s
is not the least upper bound, contradicting the definition of s. Thus we are
forced to conclude that s² = a. Now, if a < 1, then 1/a > 1 and 1/√(1/a) is
a positive square root of a. The square root is unique by Exercise 1.2.6.


1.4.5 By completing the square, x solves ax² + bx + c = 0 iff x solves (x +
b/2a)² = (b² − 4ac)/4a². If b² − 4ac < 0, this shows that there are no solutions.
If b² − 4ac = 0, this shows that x = −b/2a is the only solution. If b² − 4ac > 0,
take the square root of both sides to obtain x + b/2a = ±(√(b² − 4ac))/2a.
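
The three cases translate directly into a small routine. This Python sketch is an illustration in floating-point arithmetic (so the test for a zero discriminant is exact only for inputs where b² − 4ac is computed exactly); the name `real_roots` is hypothetical.

```python
import math

def real_roots(a, b, c):
    """Real solutions of a x^2 + b x + c = 0 (a != 0), by the sign of b^2 - 4ac."""
    if a == 0:
        raise ValueError("a must be nonzero")
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                       # no real solutions
    if disc == 0:
        return [-b / (2 * a)]           # the single solution x = -b/2a
    s = math.sqrt(disc)
    return sorted([(-b - s) / (2 * a), (-b + s) / (2 * a)])

print(real_roots(1, -3, 2), real_roots(1, 2, 1), real_roots(1, 0, 1))
```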


1.4.6 If 0 < a < b, then a^n < b^n is true for n = 1. If it is true for n,
then a^{n+1} = a^n a < b^n a < b^n b = b^{n+1}. Hence by induction, it is true for
all n. Hence 0 < a < b implies a^n < b^n. If a^n < b^n and a > b, then
by applying the previous, we obtain a^n > b^n, a contradiction. Hence for
a, b > 0, a < b iff a^n < b^n. For the second part, assume that a > 1, and
let S = {x : x ≥ 1 and x^n < a}. Since x ≥ 1, x^{n−1} ≥ 1. Hence x = x1 ≤
x x^{n−1} = x^n < a. Thus s = sup S exists. We claim that s^n = a. If s^n < a,
then b = s^{n−1}(2^n − 1)/(a − s^n) is a well-defined, positive real. Choose a
natural k > b. Then s^{n−1}(2^n − 1)/k < a − s^n. Hence by Exercise 1.3.22,

\[ \Bigl(s + \frac{1}{k}\Bigr)^n = s^n\Bigl(1 + \frac{1}{sk}\Bigr)^n \le s^n\Bigl(1 + \frac{2^n - 1}{sk}\Bigr) = s^n + \frac{s^{n-1}(2^n - 1)}{k} < a. \]

Hence s + 1/k ∈ S. Hence s is not an upper bound for S, a contradiction. If
s^n > a, b = ns^{n−1}/(s^n − a) is a well defined, positive real, so choose k > b.
By Exercise 1.3.22,

\[ \Bigl(s - \frac{1}{k}\Bigr)^n = s^n\Bigl(1 - \frac{1}{sk}\Bigr)^n \ge s^n\Bigl(1 - \frac{n}{sk}\Bigr) = s^n - \frac{n s^{n-1}}{k} > a. \]

Hence by the first part of this exercise, s − 1/k is an upper bound for S. This
shows that s is not the least upper bound, a contradiction. We conclude that
s^n = a. Uniqueness follows from the first part of this exercise.


1.4.7 If t = k/(n√2) is a rational p/q, then √2 = (kq)/(np) is rational, a
contradiction.


1.4.8 Let (b) denote the fractional part of b. If a = 0, the result is clear. If
a ≠ 0, the fractional parts {(na) : n ≥ 1} of {na : n ≥ 1} are in [0,1]. Now,
divide [0,1] into finitely many subintervals, each of length, at most, ε. Then
the fractional parts of at least two terms pa and qa, p ≠ q, p, q ∈ N, must
lie in the same subinterval. Hence |(pa) − (qa)| < ε. Since pa − (pa) ∈ Z,
qa − (qa) ∈ Z, we obtain |(p − q)a − m| < ε for some integer m. Choosing
n = p − q, we obtain |na − m| < ε.
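
The pigeonhole argument is constructive and can be run directly. The Python sketch below is an illustration in floating-point arithmetic; the name `approximate` is hypothetical, and the buckets are chosen slightly shorter than ε so that two fractional parts in the same bucket differ by strictly less than ε.

```python
import math

def approximate(a, eps):
    """Find a natural n and an integer m with |n*a - m| < eps by pigeonholing."""
    buckets = math.ceil(1 / eps) + 1        # bucket length 1/buckets < eps
    seen = {}                               # bucket index -> earlier multiplier q
    p = 1
    while True:
        frac = p * a - math.floor(p * a)    # fractional part of p*a
        key = min(int(frac * buckets), buckets - 1)
        if key in seen:                     # two multiples share a bucket
            q = seen[key]
            return p - q, math.floor(p * a) - math.floor(q * a)
        seen[key] = p
        p += 1

n, m = approximate(math.sqrt(2), 0.01)
print(n, m, abs(n * math.sqrt(2) - m))
```

Since there are only finitely many buckets, the loop stops after at most `buckets + 1` multiples, and n = p − q is positive because p is the later of the two.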


1.4.9 The key issue here is that |2n² − m²| = n²|f(m/n)| is a nonzero integer,
since all the roots of f(x) = 0 are irrational. Hence |2n² − m²| ≥ 1. But
2n² − m² = (n√2 − m)(n√2 + m), so |n√2 − m| ≥ 1/(n√2 + m). Dividing
by n, we obtain

\[ \Bigl|\sqrt{2} - \frac{m}{n}\Bigr| \ge \frac{1}{(\sqrt{2} + m/n)\,n^2}. \tag{A.1.1} \]

Now, if |√2 − m/n| ≥ 1, the result we are trying to show is clear. So let us
assume that |√2 − m/n| ≤ 1. In this case, √2 + m/n = 2√2 + (m/n − √2) ≤
2√2 + 1. Inserting this in the denominator of the right side of (A.1.1), the
result follows.


1.4.10 By Exercise 1.4.5, the (real) roots of f(x) = 0 are ±a. The key issue
here is that |m⁴ − 2m²n² − n⁴| = n⁴|f(m/n)| is a nonzero integer, since all
the roots of f(x) = 0 are irrational. Hence n⁴|f(m/n)| ≥ 1. But, by factoring,
f(x) = (x − a)g(x) with g(x) = (x + a)(x² + √2 − 1), so

\[ \Bigl|a - \frac{m}{n}\Bigr| \ge \frac{1}{n^4\, g(m/n)}. \tag{A.1.2} \]

Now, there are two cases: If |a − m/n| ≥ 1, we obtain |a − m/n| ≥ 1/n⁴ since
1 ≥ 1/n⁴. If |a − m/n| ≤ 1, then 0 < m/n < 3. So from the formula for g, we
obtain 0 < g(m/n) ≤ 51. Inserting this in the denominator of the right side
of (A.1.2), the result follows with c = 1/51.


1.4.11 First, we verify the absolute value properties A, B, and C for x, y ∈ Z.
A is clear. For x, y ∈ Z, let x = 2^k p and y = 2^j q with p, q odd. Then xy =
2^{j+k} pq with pq odd, establishing B for x, y ∈ Z. For C, let i = min(j,k)
and note x + y = 0 or x + y = 2^ℓ r with r odd and ℓ ≥ i. In the first case, |x +
y|_2 = 0, whereas in the second, |x + y|_2 = 2^{−ℓ}. Hence |x + y|_2 ≤ 2^{−i} =
max(2^{−j}, 2^{−k}) = max(|x|_2, |y|_2) ≤ |x|_2 + |y|_2. Now, using B for x, y, z ∈
Z, |zx|_2 = |z|_2|x|_2 and |zy|_2 = |z|_2|y|_2. Hence |zx/zy|_2 = |zx|_2/|zy|_2 =
|z|_2|x|_2/|z|_2|y|_2 = |x|_2/|y|_2 = |x/y|_2. Hence |·|_2 is well defined on Q. Now,
using A, B, and C for x, y ∈ Z, one checks A, B, and C for x, y ∈ Q.
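
The 2-adic absolute value is easy to compute exactly, which gives a quick way to spot-check multiplicativity and the strong triangle inequality on rationals. A minimal Python sketch (the names `abs2` and `v2` are hypothetical):

```python
from fractions import Fraction

def v2(n):
    """Exponent of 2 in a nonzero integer n."""
    n, k = abs(n), 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k

def abs2(x):
    """2-adic absolute value |x|_2 of a rational x, with |0|_2 = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(1, 2) ** (v2(x.numerator) - v2(x.denominator))

x, y = Fraction(12, 5), Fraction(3, 8)
print(abs2(x), abs2(y), abs2(x * y), abs2(x + y))
```

Because `Fraction` keeps numerator and denominator in lowest terms, the value does not depend on how the rational is written, matching the well-definedness argument above.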


1.4.12 Clearly x = √2 − 1 satisfies x = 1/(2 + x); hence by Exercise 1.3.18,
it cannot be rational.


1.4.13 Assume x ∈ Q. Then S = {k ∈ N : kx ∈ N} is nonempty, d = min S
is well defined, and dx ∈ N. Let d′ = d(x − ⌊x⌋). Then 0 ≤ d′ < d, d′ ∈ N ∪ {0},
and d′x = d(x − ⌊x⌋)x = dn − dx⌊x⌋ ∈ N. If d′ > 0, then d′ ∈ S contradicting
d = min S. Hence d′ = 0 or x ∈ N.


Solutions to Exercises 1.5 


1.5.1 If (a_n) is increasing, {a_n : n ≥ 1} and {a_n : n ≥ N} have the same
sups, similarly for decreasing. If a_n → L, then a_n^* ↘ L. Hence a_{n+N}^* ↘ L
and a_{n*} ↗ L. Hence a_{(n+N)*} ↗ L. We conclude that a_{n+N} → L.

1.5.2 If a_n ↗ L, then L = sup{a_n : n ≥ 1}, so −L = inf{−a_n : n ≥ 1}, so
−a_n ↘ −L. Similarly, if a_n ↘ L, then −a_n ↗ −L. If a_n → L, then a_n^* ↘ L,
so (−a_n)_* = −a_n^* ↗ −L, and a_{n*} ↗ L, so (−a_n)^* = −a_{n*} ↘ −L. Hence
−a_n → −L.


1.5.3 First, if A ⊂ R⁺, let 1/A = {1/x : x ∈ A}. Then inf(1/A) = 0 implies
that, for all c > 0, there exists x ∈ A with 1/x < 1/c or x > c. Hence
sup A = ∞. Conversely, sup A = ∞ implies inf(1/A) = 0. If inf(1/A) > 0,
then c > 0 is a lower bound for 1/A iff 1/c is an upper bound for A. Hence
sup A < ∞ and inf(1/A) = 1/sup A. If 1/∞ is interpreted as 0, we obtain
inf(1/A) = 1/sup A in all cases. Applying this to A = {a_k : k ≥ n} yields
b_{n*} = 1/a_n^*, n ≥ 1. Moreover, A is bounded above iff inf(1/A) > 0. Applying
this to A = {b_{n*} : n ≥ 1} yields sup{b_{n*} : n ≥ 1} = ∞ since a_n^* ↘ 0. Hence
b_n → ∞. For the converse, b_n → ∞ implies sup{b_{n*} : n ≥ 1} = ∞. Hence
inf{a_n^* : n ≥ 1} = 0. Hence a_n → 0.


1.5.4 Since k_n ≥ n, a_{n*} ≤ a_{k_n} ≤ a_n^*. Since a_n → L means a_n^* → L and
a_{n*} → L, the ordering property implies a_{k_n} → L. Now assume (a_n) is
increasing. Then (a_{k_n}) is increasing. Suppose that a_n → L and a_{k_n} → M.
Since (a_n) is increasing and k_n ≥ n, a_{k_n} ≥ a_n, n ≥ 1. Hence by ordering,
M ≥ L. On the other hand, {a_{k_n} : n ≥ 1} ⊂ {a_n : n ≥ 1}. Since M and L
are the sups of these sets, M ≤ L. Hence M = L.

1.5.5 From the text, we know that all but finitely many a_n lie in (L − ε, L + ε),
for any ε > 0. Choosing ε = L shows that all but finitely many terms are
positive.


1.5.6 The sequence a_n = √(n + 1) − √n = 1/(√(n + 1) + √n) is decreasing. Since
a_{n²} ≤ 1/n has limit zero, so does (a_n). Hence a_n^* = a_n and a_{n*} = 0 = a_* = a^*.


1.5.7 We do only the case a^* finite. Since a_n^* → a^*, a_n^* is finite for each n ≥ N
beyond some N ≥ 1. Now, a_n^* = sup{a_k : k ≥ n}, so for each n ≥ 1, we can
choose k_n ≥ n, such that a_n^* ≥ a_{k_n} > a_n^* − 1/n. Then the sequence (a_{k_n})
lies between (a_n^*) and (a_n^* − 1/n) and hence converges to a^*. But (a_{k_n}) may
not be a subsequence of (a_n) because the sequence (k_n) may not be strictly
increasing. To take care of this, note that, since k_n ≥ n, we can choose a
subsequence (k_{j_n}) of (k_n) which is strictly increasing. Then (a_p : p = k_{j_n}) is
a subsequence of (a_p) converging to a^*, similarly (or multiply by minuses)
for a_*.


1.5.8 If $a_n \not\to L$, then by definition, either $a^* \ne L$ or $a_* \ne L$. For definiteness,
suppose that $a_* \ne L$. Then from Exercise 1.5.7, there is a subsequence $(a_{k_n})$
converging to $a_*$. From §1.5, if $2\epsilon = |L - a_*| > 0$, all but finitely many of the
terms $a_{k_n}$ lie in the interval $(a_* - \epsilon, a_* + \epsilon)$. Since $\epsilon$ is chosen to be half the
distance between $a_*$ and $L$, this implies that these same terms lie outside the
interval $(L - \epsilon, L + \epsilon)$. Hence these terms form a subsequence as requested.


1.5.9 From Exercise 1.5.7, we know that $x_*$ and $x^*$ are limit points. If
$(x_{k_n})$ is a subsequence converging to a limit point $L$, then since $k_n \ge n$,
$x_{n*} \le x_{k_n} \le x_n^*$ for all $n \ge 1$. By the ordering property for sequences, taking
the limit yields $x_* \le L \le x^*$.


1.5.10 If $x_n \to L$, then $x_* = x^* = L$. Since $x_*$ and $x^*$ are the smallest and
the largest limit points, $L$ must be the only one. Conversely, if there is only
one limit point, then $x_* = x^*$ since $x_*$ and $x^*$ are always limit points.


1.5.11 If $M < \infty$, then for each $n \ge 1$, the number $M - 1/n$ is not an upper
bound for the displayed set. Hence there is an $x_n \in (a,b)$ with $f(x_n) >
M - 1/n$. Since $f(x_n) \le M$, we see that $f(x_n) \to M$, as $n \nearrow \infty$. If $M = \infty$,
for each $n \ge 1$, the number $n$ is not an upper bound for the displayed set.
Hence there is an $x_n \in (a,b)$ with $f(x_n) > n$. Then $f(x_n) \to \infty = M$.
1.5.12 Note that

$$\frac{1}{2}\left(x + \frac{2}{x}\right) - \sqrt{2} = \frac{(x - \sqrt{2})^2}{2x}, \qquad x > 0. \tag{A.1.3}$$

Since $e_1 = 2 - \sqrt{2}$, $e_1 > 0$. By (A.1.3), $e_{n+1} > 0$ as soon as $e_n > 0$. Hence
$e_n > 0$ for all $n \ge 1$ by induction. Similarly, (A.1.3) with $x = d_n$ plugged in
and $d_n \ge \sqrt{2}$, $n \ge 1$, yield $e_{n+1} \le e_n^2/2\sqrt{2}$, $n \ge 1$.
1.5.13 If $f(x) = 1/(q + x)$, then $|f(a) - f(b)| \le f(a)f(b)|a - b|$. This implies
A. Now, A implies $|x - x_n| \le x_n|x' - x_n'| \le x_n x_n'|x'' - x_n''| \le \dots$, where $x_n^{(k)}$
denotes $x_n$ with $k$ layers "peeled off." Hence

$$|x - x_n| \le x_n x_n' x_n'' \cdots x_n^{(n-1)}, \qquad n \ge 1. \tag{A.1.4}$$

Since $x_n^{(n-1)} = 1/a_n$, (A.1.4) implies B. For C, note that, since $a_n \ge 1$,
$x \le 1/[1 + 1/(a_2 + 1)] = (a_2 + 1)/(a_2 + 2)$. Let $a = (c + 1)/(c + 2)$. Now,
if one of the $a_k$'s is bounded by $c$, (A.1.4) and C imply $|x - x_n| \le a$, as
soon as $n$ is large enough, since all the factors in (A.1.4) are bounded by 1.
Similarly, if two of the $a_k$'s are bounded by $c$, $|x - x_n| \le a^2$, as soon as $n$ is
large enough. Continuing in this manner, we obtain D. If $a_n \to \infty$, we are
done, by B. If not, then there is a $c$ such that $a_n \le c$ for infinitely many $n$.
By D, we conclude that the upper and lower limits of $(|x - x_n|)$ lie between
$0$ and $a^N$, for all $N \ge 1$. Since $a^N \to 0$, the upper and lower limits are 0.
Hence $|x - x_n| \to 0$.


Solutions to Exercises 1.6 


1.6.1 First, suppose that the decimal expansion of $x$ is as advertised. Then
the fractional part of $10^{n+m}x$ is identical to the fractional part of $10^m x$. Hence
$x(10^{n+m} - 10^m) \in \mathbf{Z}$, or $x \in \mathbf{Q}$. Conversely, if $x = m/n \in \mathbf{Q}$, perform long
division to obtain the digits: From Exercise 1.3.9, $10m = nd_1 + r_1$, obtaining
the quotient $d_1$ and remainder $r_1$. Similarly, $10r_1 = nd_2 + r_2$, obtaining the
quotient $d_2$ and remainder $r_2$. Similarly, $10r_2 = nd_3 + r_3$, obtaining $d_3$ and
$r_3$, and so on. Here $d_1, d_2, \dots$ are digits (since $0 < x < 1$), and $r_1, r_2, \dots$
are zero or naturals less than $n$. At some point the remainders must start
repeating and, therefore, the digits also.
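
The long division in Solution 1.6.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `decimal_digits` and `repetend` are hypothetical, and the input is assumed to satisfy $0 < m < n$.

```python
def decimal_digits(m, n, count):
    """First `count` decimal digits of m/n (0 < m < n) by long division."""
    digits, r = [], m
    for _ in range(count):
        d, r = divmod(10 * r, n)   # 10*r = n*d + r', as in the solution
        digits.append(d)
    return digits


def repetend(m, n):
    """Return (prefix, cycle): the digits before the repetition and the repeating block."""
    seen = {}                      # remainder -> position where it first appeared
    digits, r = [], m
    while r not in seen:           # at most n distinct remainders, so this terminates
        seen[r] = len(digits)
        d, r = divmod(10 * r, n)
        digits.append(d)
    start = seen[r]
    return digits[:start], digits[start:]
```

For instance, `repetend(1, 6)` separates the non-repeating digit 1 from the repeating digit 6, and a terminating expansion shows up as a repeating block of zeros.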


1.6.2 We assume $d_N > e_N$, and let $x = .d_1d_2\cdots = .e_1e_2\cdots$. If

$$y = .d_1d_2 \dots d_{N-1}(e_N + 1)00\dots,$$

then $x \ge y$. If

$$z = .e_1e_2 \dots e_N 99\dots,$$

then $z \ge x$. Since $.99\cdots = 1$, $z = y$. Hence $x = y$ and $x = z$. Since $x = y$,
$d_N = e_N + 1$ and $d_j = 0$ for $j > N$. Since $x = z$, $e_j = 9$ for $j > N$. Clearly,
this happens iff $10^N x \in \mathbf{Z}$.


1.6.3 From Exercise 1.3.22, $(1 + b)^n \ge 1 + nb$ for $b \ge -1$. In this inequality,
replace $n$ by $N$ and $b$ by $-1/N(n + 1)$ to obtain $[1 - 1/N(n + 1)]^N \ge 1 -
1/(n + 1) = n/(n + 1)$. By Exercise 1.4.6, we may take $N$th roots of both
sides, yielding A. B follows by multiplying A by $(n + 1)^{1/N}$ and rearranging.
If $a_n = 1/n^{1/N}$, then by B,

$$a_n - a_{n+1} = \frac{(n+1)^{1/N} - n^{1/N}}{n^{1/N}(n+1)^{1/N}}
\ge \frac{1}{N(n+1)^{(N-1)/N}\, n^{1/N}(n+1)^{1/N}}
\ge \frac{1}{N(n+1)^{1+1/N}}.$$

Summing over $n \ge 1$ yields C.


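1.6.4 Since $e_1 = 2 - \sqrt{2}$ and $e_{n+1} \le e_n^2/2\sqrt{2}$, $e_2 \le e_1^2/2\sqrt{2} = (3 - 2\sqrt{2})/\sqrt{2}$.
Similarly,

$$e_3 \le \frac{e_2^2}{2\sqrt{2}} \le \frac{(3 - 2\sqrt{2})^2}{4\sqrt{2}} = \frac{17 - 12\sqrt{2}}{4\sqrt{2}} = \frac{1}{4(17\sqrt{2} + 24)} \le \frac{1}{100}.$$

Now, assume the inductive hypothesis $e_{n+2} \le 10^{-2^n}$. Then

$$e_{(n+1)+2} = e_{n+3} \le \frac{e_{n+2}^2}{2\sqrt{2}} \le e_{n+2}^2 \le \left(10^{-2^n}\right)^2 = 10^{-2^{n+1}}.$$

Thus the inequality is true by induction.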
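
As a numerical sanity check of this bound, the following sketch computes the errors with high-precision decimals. It assumes the recursion $d_1 = 2$, $d_{n+1} = (d_n + 2/d_n)/2$ with $e_n = d_n - \sqrt{2}$, which is consistent with $e_1 = 2 - \sqrt{2}$ above but is not restated in this solution; the name `errors` is hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 120            # enough digits to resolve errors near 10**-98


def errors(count):
    """Return [e_1, ..., e_count] for d_1 = 2, d_{n+1} = (d_n + 2/d_n)/2 (assumed recursion)."""
    root2 = Decimal(2).sqrt()
    d, out = Decimal(2), []
    for _ in range(count):
        out.append(d - root2)      # e_n = d_n - sqrt(2)
        d = (d + 2 / d) / 2
    return out


e = errors(8)                      # e[0] is e_1, so e_{n+2} is e[n + 1]
for n in range(1, 7):
    assert e[n + 1] <= Decimal(10) ** -(2 ** n)
```

The loop checks $e_{n+2} \le 10^{-2^n}$ for $n = 1, \dots, 6$ only; it illustrates the doubling of correct digits and is no substitute for the induction.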


1.6.5 Since $[0,2] = 2[0,1]$, given $z \in [0,1]$, we have to find $x \in C$ and
$y \in C$ satisfying $x + y = 2z$. Let $z = .d_1d_2d_3\dots$. Then for all $n \ge 1$, $2d_n$
is an even integer satisfying $0 \le 2d_n \le 18$. Thus there are digits, zero or
odd (i.e., 0, 1, 3, 5, 7, 9), $d_n'$, $d_n''$, $n \ge 1$, satisfying $d_n' + d_n'' = 2d_n$. Now, set

$$x = .d_1'd_2'd_3'\dots \quad\text{and}\quad y = .d_1''d_2''d_3''\dots.$$
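
The digit-by-digit choice in Solution 1.6.5 can be made explicit. The sketch below takes $C$ to be the set of reals whose digits are zero or odd, as the solution indicates; `ALLOWED`, `split_digit`, and `split_expansion` are hypothetical names.

```python
ALLOWED = (0, 1, 3, 5, 7, 9)       # digits that are zero or odd


def split_digit(d):
    """Return (d1, d2), both in ALLOWED, with d1 + d2 == 2*d for a decimal digit d."""
    for d1 in ALLOWED:
        d2 = 2 * d - d1
        if d2 in ALLOWED:
            return d1, d2
    raise ValueError("no decomposition")   # not reached for d in 0..9


def split_expansion(digits):
    """Split the digits of z into the digits of x and of y, so that x + y = 2z."""
    pairs = [split_digit(d) for d in digits]
    return [p[0] for p in pairs], [p[1] for p in pairs]
```

Because each pair sums to exactly $2d_n$, no carrying between positions is involved, which is why the choice can be made one digit at a time.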
Solutions to Exercises 1.7


1.7.1 Since $B$ is countable, there is a bijection $f : B \to \mathbf{N}$. Then $f$ restricted
to $A$ is a bijection between $A$ and $f(A) \subset \mathbf{N}$. Thus it is enough to show that
$C = f(A)$ is countable or finite. If $C$ is finite, we are done, so assume $C$ is
infinite. Since $C \subset \mathbf{N}$, let $c_1 = \min C$, $c_2 = \min C \setminus \{c_1\}$, $c_3 = \min C \setminus \{c_1, c_2\}$,
etc. Then $c_1 < c_2 < c_3 < \dots$. Since $C$ is infinite, $C_n = C \setminus \{c_1, \dots, c_n\}$ is not
empty, allowing us to set $c_{n+1} = \min C_n$, for all $n \ge 1$. Since $(c_n)$ is strictly
increasing, we must have $c_n \ge n$ for $n \ge 1$. If $m \in C \setminus \{c_n : n \ge 1\}$, then by
construction, $m \ge c_n$ for all $n \ge 1$, which is impossible. Thus $C = \{c_n : n \ge
1\}$ and $g : \mathbf{N} \to C$ given by $g(n) = c_n$, $n \ge 1$, is a bijection.


1.7.2 With each rational $r$, associate the pair $f(r) = (m, n)$, where $r = m/n$
in lowest terms. Then $f : \mathbf{Q} \to \mathbf{N}^2$ is an injection. Since $\mathbf{N}^2$ is countable and
an infinite subset of a countable set is countable, so is $\mathbf{Q}$.


1.7.3 If $f : \mathbf{N} \to A$ and $g : \mathbf{N} \to B$ are bijections, then $(f, g) : \mathbf{N} \times \mathbf{N} \to A \times B$
is a bijection.


1.7.4 Suppose that $[0,1]$ is countable, and list the elements as

$$\begin{aligned}
a_1 &= .d_{11}d_{12}\dots \\
a_2 &= .d_{21}d_{22}\dots \\
a_3 &= .d_{31}d_{32}\dots \\
&\;\;\vdots
\end{aligned}$$

Let $a = .d_1d_2\dots$, where $d_n$ is any digit chosen such that $d_n \ne d_{nn}$, $n \ge 1$
(the "diagonal" in the above listing). Then $a$ is not in the list, so the list is
not complete, a contradiction. Hence $[0,1]$ is not countable.


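1.7.5 Note that $i + j = n + 1$ implies $i^3 + j^3 \ge (n + 1)^3/8$ since at least one
of $i$ or $j$ is $\ge (n + 1)/2$. Sum (1.7.6) in the order of $\mathbf{N}^2$ given in §1.7:

$$\sum_{(m,n) \in \mathbf{N}^2} \frac{1}{m^3 + n^3}
= \sum_{n=1}^{\infty} \left( \sum_{i+j=n+1} \frac{1}{i^3 + j^3} \right)
\le \sum_{n=1}^{\infty} \left( \sum_{i+j=n+1} \frac{8}{(n+1)^3} \right)
= \sum_{n=1}^{\infty} \frac{8n}{(n+1)^3} < \infty.$$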
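1.7.6 Since $n^{-s} < 1$, the geometric series implies

$$\frac{1}{n^s - 1} = \sum_{m=1}^{\infty} \left(n^{-s}\right)^m.$$

Summing over $n \ge 2$,

$$\sum_{n=2}^{\infty} \frac{1}{n^s - 1}
= \sum_{n=2}^{\infty} \left( \sum_{m=1}^{\infty} n^{-sm} \right)
= \sum_{m=1}^{\infty} \left( \sum_{n=2}^{\infty} n^{-sm} \right)
= \sum_{m=1}^{\infty} Z(ms).$$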
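
The interchange of summation can be checked numerically with truncated sums. In this sketch $Z(s)$ is taken to be $\sum_{n \ge 2} n^{-s}$, as the last step of the display suggests; the names `lhs` and `rhs` and the truncation limits are illustrative choices.

```python
def lhs(s, n_max):
    """Truncation of sum_{n>=2} 1/(n^s - 1) to 2 <= n <= n_max."""
    return sum(1.0 / (n ** s - 1) for n in range(2, n_max + 1))


def rhs(s, n_max, m_max):
    """Truncation of sum_{m>=1} Z(ms), with Z(ms) cut off at n <= n_max."""
    return sum(
        sum(n ** (-m * s) for n in range(2, n_max + 1))
        for m in range(1, m_max + 1)
    )


left = lhs(2, 2000)
right = rhs(2, 2000, 40)
```

With the same cutoff in $n$ on both sides, the two values differ only by the neglected terms with $m > 40$, which are far below floating-point resolution. For $s = 2$ both are close to $3/4$, the value of the telescoping series $\sum_{n \ge 2} 1/(n^2 - 1)$.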


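1.7.7 Let $\tilde a_n = (-1)^{n+1}a_n$ and $\tilde b_n = (-1)^{n+1}b_n$. If $\sum \tilde c_n$ is the Cauchy
product of $\sum \tilde a_n$ and $\sum \tilde b_n$, then

$$\tilde c_n = \sum_{i+j=n+1} \tilde a_i \tilde b_j = \sum_{i+j=n+1} (-1)^{i+1}(-1)^{j+1} a_i b_j = (-1)^{n+1} c_n,$$

where $\sum c_n$ is the Cauchy product of $\sum a_n$ and $\sum b_n$.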


1.7.8 As in Exercise 1.5.13, $|x_n - x_m| \le x_n x_n' x_n'' \cdots x_n^{(n-1)}$, for $m \ge n \ge 1$.
Since $x_n^{(n-1)} = 1/a_n$, this yields $|x_n - x_m| \le 1/a_n$, for $m \ge n \ge 1$. Hence if
$a_n \to \infty$, $(1/a_n)$ is an error sequence for $(x_n)$. Now, suppose that $a_n \not\to \infty$.
Then there is a $c$ with $a_n \le c$ for infinitely many $n$. Hence if $N_n$ is the
number of $a_k$'s, $k \le n + 2$, bounded by $c$, $\lim_{n \nearrow \infty} N_n = \infty$. Since $x_n^{(k)} \le
(a_{k+2} + 1)/(a_{k+2} + 2)$, the first inequality above implies that

$$|x_n - x_m| \le \left( \frac{c + 1}{c + 2} \right)^{N_n}, \qquad m \ge n \ge 1.$$

Set $a = (c + 1)/(c + 2)$. Then in this case, $(a^{N_n})$ is an error sequence for
$(x_n)$. For the golden mean $x$, note that $x = 1 + 1/x$; solving the quadratic,
we obtain $x = (1 \pm \sqrt{5})/2$. Since $x > 0$, we must take the $+$.
A.2 Solutions to Chapter 2


Solutions to Exercises 2.1 


2.1.1 By the theorem, select a subsequence $(n_k)$ such that $(a_{n_k})$ converges
to some $a$. Now apply the theorem to $(b_{n_k})$, selecting a sub-subsequence
$(n_{k_m})$ such that $(b_{n_{k_m}})$ converges to some $b$. Then clearly $(a_{n_{k_m}})$ and $(b_{n_{k_m}})$
converge to $a$ and $b$, respectively; hence $(a_n, b_n)$ subconverges to $(a, b)$.


2.1.2 For simplicity, we assume $a = 0$, $b = 1$. Let the limiting point be
$L = .d_1d_2\dots$. By the construction of $L$, it is a limit point. Since $x_*$ is the
smallest limit point (Exercise 1.5.9), $x_* \le L$. Now, note that if $t \in [0,1]$
satisfies $t \le x_n$ for all $n \ge 1$, then $t \le$ any limit point of $(x_n)$. Hence
$t \le x_*$. Since changing finitely many terms of a sequence does not change
$x_*$, we conclude that $x_* \ge t$ for all $t$ satisfying $t \le x_n$ for all but finitely many $n$.
Now, by construction, there are at most finitely many terms $x_n < .d_1$. Hence
$.d_1 \le x_*$. Similarly, there are finitely many terms $x_n < .d_1d_2$. Hence $.d_1d_2 \le
x_*$. Continuing in this manner, we conclude that $.d_1d_2 \dots d_N \le x_*$. Letting
$N \nearrow \infty$, we obtain $L \le x_*$. Hence $x_* = L$.


2.1.3 If $c \in \bigcup \mathcal{U}$, then there is $U \in \mathcal{U}$ with $c \in U$. Select an open interval
$I$ containing $c$ that is contained in $U$. Then (§1.1) $I \subset U \subset \bigcup \mathcal{U}$, so $\bigcup \mathcal{U}$ is
open.


Solutions to Exercises 2.2 


2.2.1 If $\lim_{x \to c} f(x) \ne 0$, there is at least one sequence $x_n \to c$ with $x_n \ne c$,
$n \ge 1$, and $f(x_n) \not\to 0$. From Exercise 1.5.8, this means there is a subsequence
$(x_{k_n})$ and an $N \ge 1$, such that $|f(x_{k_n})| \ge 1/N$, $n \ge 1$. But this means that,
for all $n \ge 1$, the reals $x_{k_n}$ are rationals with denominators bounded in
absolute value by $N$. Hence $N!\,x_{k_n}$ are integers converging to $N!\,c$. But this
cannot happen unless $N!\,x_{k_n} = N!\,c$ from some point on, i.e., $x_{k_n} = c$ from
some point on, contradicting $x_n \ne c$ for all $n \ge 1$. Hence the result follows.


2.2.2 Let $L = \inf\{f(x) : a < x < b\}$. We have to show that $f(x_n) \to L$
whenever $x_n \to a+$. So suppose that $x_n \to a+$, and assume, first, $x_n \searrow a$,
i.e., $(x_n)$ is decreasing. Then $(f(x_n))$ is decreasing. Hence $f(x_n)$ decreases to
some limit $M$. Since $f(x_n) \ge L$ for all $n \ge 1$, $M \ge L$. If $d > 0$, then there
is an $x \in (a,b)$ with $f(x) < L + d$. Since $x_n \searrow a$, there is an $n \ge 1$ with
$x_n < x$. Hence $M \le f(x_n) \le f(x) < L + d$. Since $d > 0$ is arbitrary, we
conclude that $M = L$ or $f(x_n) \searrow L$. In general, if $x_n \to a+$, $x_n^* \searrow a$. Hence
$f(x_n^*) \searrow L$. But $x_n \le x_n^*$; hence $L \le f(x_n) \le f(x_n^*)$. So by the ordering
property, $f(x_n) \to L$. This establishes the result for the inf. For the sup,
repeat the reasoning, or apply the first case to $g(x) = -f(-x)$.
2.2.3 Assume $f$ is increasing. If $a < c < b$, then apply the previous exercise
to $f$ on $(c, b)$, concluding that $f(c+)$ exists. Since $f(x) \ge f(c)$ for $x > c$, we
obtain $f(c+) \ge f(c)$. Apply the previous exercise to $f$ on $(a, c)$ to conclude
that $f(c-)$ exists and $f(c-) \le f(c)$. Hence $f(c)$ is between $f(c-)$ and $f(c+)$.
Now, if $a < A < B < b$ and there are $N$ points in $[A, B]$ where $f$ jumps by
at least $\delta > 0$, then we must have $f(B) - f(A) \ge N\delta$. Hence given $\delta$ and $A$,
$B$, there are, at most, finitely many such points. Choosing $A = a + 1/n$ and
$B = b - 1/n$ and taking the union of all these points over all the cases $n \ge 1$,
we see that there are, at most, countably many points in $(a,b)$ at which the
jump is at least $\delta$. Now, choosing $\delta$ equal to $1, 1/2, 1/3, \dots$, and taking the
union of all the cases, we conclude that there are, at most, countably many
points $c \in (a,b)$ at which $f(c+) > f(c-)$. The decreasing case is obtained by
multiplying by minus.


2.2.4 This follows from

$$|f(x)| \le |f(a)| + |f(x) - f(a)| \le |f(a)| + v_f(a, b), \qquad a < x < b.$$


2.2.5 If $F$ is increasing, there are no absolute values in the variation (2.2.1).
But $F(d) - F(c) \ge 0$ for $d > c$, so the variation is $\le F(b) - F(a)$. On the
other hand, taking $I_1 = (a, b)$ shows $F(b) - F(a) \le v_F(a, b)$. Hence $v_F(a, b) =
F(b) - F(a) < \infty$. Let $f, g$ be defined on $[a, b]$. Then $v_{-f}(a, b) = v_f(a, b)$, and
by the triangle inequality, $v_{f+g}(a, b) \le v_f(a, b) + v_g(a, b)$. Thus $f, g$ bounded
variation implies $f + g$ bounded variation.


2.2.6 Let $a < x < y < b$. If $I_k = (c_k, d_k)$, $1 \le k \le N$, are disjoint open intervals in
$(a, x)$, then $I_1, \dots, I_N, (x, y)$ are disjoint open intervals in $(a, y)$; hence

$$\sum_{k=1}^{N} |f(d_k) - f(c_k)| + |f(y) - f(x)| \le v_f(a, y).$$

Taking the sup over all disjoint open intervals in $(a, x)$ yields $v(x) + |f(y) -
f(x)| \le v(y)$. Hence $v$ is increasing and, throwing away the absolute value,
$v(x) - f(x) \le v(y) - f(y)$, so $v - f$ is increasing. Thus $f = v - (v - f)$
is the difference of two increasing functions. Interchanging $x$ and $y$ yields
$|v(x) - v(y)| \le |f(x) - f(y)|$, so the continuity of $v$ follows from that of $f$.


2.2.7 Look at the partition $x_i = 1/(n - i)$, $1 \le i \le n - 1$, $x_0 = 0$, $x_n = 2$, and
select $x_i'$ irrational in $(x_{i-1}, x_i)$, $i = 1, \dots, n$. Then $|f(x_i) - f(x_i')| = 1/(n - i)$ for $1 \le i \le n - 1$;
hence the variation corresponding to this partition is not less than the $n$th
partial sum of the harmonic series, which diverges.


2.2.8 If $x_n \to c$ and $(f(x_n))$ subconverges to $L$, by replacing $(x_n)$ by a
subsequence, we may assume $(f(x_n))$ converges to $L$. Given $\delta > 0$, since
$x_n \to c$, $c - \delta < x_n < c + \delta$ for $n$ sufficiently large. Thus

$$\inf_{0 < |x - c| < \delta} f(x) \le f(x_n) \le \sup_{0 < |x - c| < \delta} f(x), \qquad n \gg 1.$$

Taking the limit $n \to \infty$,

$$\inf_{0 < |x - c| < \delta} f(x) \le L \le \sup_{0 < |x - c| < \delta} f(x)$$

for all $\delta > 0$. Now take the supremum over $\delta$ on the left and the infimum
over $\delta$ on the right to obtain $L_* \le L \le L^*$. Thus any limit point $L$ lies in
$[L_*, L^*]$. To show $L^*$ is a limit point, for $n \ge 1$, select $\delta_n > 0$ such that

$$L^* \le \sup_{0 < |x - c| < \delta_n} f(x) < L^* + \frac{1}{n}.$$

Now select $x_n$ in $0 < |x - c| < \delta_n$ such that $\sup_{0 < |x - c| < \delta_n} f(x) < f(x_n) + 1/n$.
Then

$$L^* - \frac{1}{n} < f(x_n) < L^* + \frac{1}{n};$$

hence $f(x_n) \to L^*$ as $n \to \infty$. Thus $L^*$ is a limit point, similarly for $L_*$.


Solutions to Exercises 2.3 


2.3.1 Let $f$ be a polynomial with odd degree $n$ and highest-order coefficient
$a_0$. Since $x^k/x^n \to 0$, as $x \to \pm\infty$, for $n > k$, it follows that $f(x)/x^n \to
a_0$. Since $x^n \to \pm\infty$, as $x \to \pm\infty$, it follows that $f(\pm\infty) = \pm\infty$, at least when
$a_0 > 0$. When $a_0 < 0$, the same reasoning leads to $f(\pm\infty) = \mp\infty$. Thus there
are reals $a$, $b$ with $f(a) > 0$ and $f(b) < 0$. Hence by the intermediate value
property, there is a $c$ with $f(c) = 0$.
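
The argument is constructive enough to turn into a root finder. The sketch below widens a symmetric interval until the signs at its endpoints differ, then bisects; bisection is one standard way to realize the intermediate value property numerically and is not taken from the text. The names `poly` and `real_root` are hypothetical.

```python
def poly(coeffs, x):
    """Evaluate a_0 x^n + ... + a_n by Horner's rule; coeffs[0] is the leading coefficient."""
    value = 0.0
    for a in coeffs:
        value = value * x + a
    return value


def real_root(coeffs):
    """Approximate a real root of an odd-degree polynomial by bracketing and bisection."""
    assert len(coeffs) % 2 == 0 and coeffs[0] != 0, "degree must be odd"
    r = 1.0
    while poly(coeffs, r) * poly(coeffs, -r) > 0:   # widen until the signs differ
        r *= 2.0
    lo, hi = -r, r
    f_lo = poly(coeffs, lo)
    if f_lo == 0:
        return lo
    for _ in range(200):                            # each step halves the bracket
        mid = (lo + hi) / 2
        f_mid = poly(coeffs, mid)
        if f_mid == 0:
            return mid
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid                   # root lies in the right half
        else:
            hi = mid                                # root lies in the left half
    return (lo + hi) / 2
```

The widening loop terminates because $f(\pm\infty)$ have opposite signs, which is exactly what the solution establishes.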


2.3.2 By definition of $\mu_c$, $|f(x) - f(c)| \le \mu_c(\delta)$ when $|x - c| < \delta$. Since
$|x - c| < 2|x - c|$ for $x \ne c$, choosing $\delta = 2|x - c|$ yields $|f(x) - f(c)| \le
\mu_c(2|x - c|)$, for $x \ne c$. If $\mu_c(0+) \ne 0$, setting $\epsilon = \mu_c(0+)/2$, $\mu_c(1/n) \ge 2\epsilon$.
By the definition of the sup in the definition of $\mu_c$, this implies that, for each
$n \ge 1$, there is an $x_n \in (a,b)$ with $|x_n - c| < 1/n$ and $|f(x_n) - f(c)| \ge \epsilon$.
Since the sequence $(x_n)$ converges to $c$ and $f(x_n) \not\to f(c)$, it follows that $f$ is
not continuous at $c$. This shows A implies B.


2.3.3 Let $A = f((a,b))$. To show that $A$ is an interval, it is enough to show
that $(\inf A, \sup A) \subset A$. By definition of inf and sup, there are sequences
$m_n \to \inf A$ and $M_n \to \sup A$ with $m_n \in A$, $M_n \in A$. Hence there are reals
$c_n$, $d_n$ with $f(c_n) = m_n$, $f(d_n) = M_n$. Since $f([c_n, d_n])$ is a compact interval,
it follows that $[m_n, M_n] \subset A$ for all $n \ge 1$. Hence $(\inf A, \sup A) \subset A$. For the
second part, if $f((a,b))$ is not an open interval, then there is a $c \in (a,b)$ with
$f(c)$ a max or a min. But this cannot happen: since $f$ is strictly monotone,
we can always find an $x$ and $y$ to the right and to the left of $c$ such that $f(x)$
and $f(y)$ are larger and smaller than $f(c)$. Thus $f((a,b))$ is an open interval.


2.3.4 Let $a = \sup A$. Since $a \ge x$ for $x \in A$ and $f$ is increasing, $f(a) \ge f(x)$
for $x \in A$, or $f(a)$ is an upper bound for $f(A)$. Since $a = \sup A$, there is a
sequence $(x_n) \subset A$ with $x_n \to a$ (Theorem 1.5.4). By continuity, $f(x_n) \to
f(a)$. Now, let $M$ be any upper bound for $f(A)$. Then $M \ge f(x_n)$, $n \ge 1$.
Hence $M \ge f(a)$. Thus $f(a)$ is the least upper bound for $f(A)$, similarly for
inf.


2.3.5 Let $y_n = f(x_n)$. From Exercise 2.3.4, $f(x_n^*) = \sup\{f(x_k) : k \ge n\} =
y_n^*$, $n \ge 1$. Since $x_n^* \to x^*$ and $f$ is continuous, $y_n^* = f(x_n^*) \to f(x^*)$. Thus
$y^* = f(x^*)$, similarly for lower stars.


2.3.6 Remember $x^r$ is defined as $(x^m)^{1/n}$ when $r = m/n$. Since

$$\left[(x^{1/n})^m\right]^n = (x^{1/n})^{mn} = \left[(x^{1/n})^n\right]^m = x^m,$$

$x^r = (x^{1/n})^m$ also. With $r = m/n \in \mathbf{Q}$ and $p$ in $\mathbf{Z}$, $(x^r)^p = [(x^{1/n})^m]^p =
(x^{1/n})^{mp} = x^{mp/n} = x^{rp}$. Now, let $r = m/n$ and $s = p/q$ with $m, n, p, q$ integers
and $nq \ne 0$. Then $[(x^r)^s]^{nq} = (x^r)^{snq} = x^{rsnq} = x^{mp}$. By the uniqueness
of roots, $(x^r)^s = (x^{mp})^{1/nq} = x^{rs}$. Similarly, $(x^r x^s)^{nq} = x^{rnq}x^{snq} =
x^{rnq+snq} = x^{(r+s)nq} = (x^{r+s})^{nq}$. By the uniqueness of roots, $x^r x^s = x^{r+s}$.


2.3.7 We are given $a^b = \sup\{a^r : 0 < r < b, r \in \mathbf{Q}\}$, and we need to show
that $a^b = c$, where $c = \inf\{a^s : s > b, s \in \mathbf{Q}\}$. If $r, s$ are rationals with
$r < b < s$, then $a^r < a^s$. Taking the sup over all $r < b$ yields $a^b \le a^s$. Taking
the inf over all $s > b$ implies $a^b \le c$. On the other hand, choose $r < b < s$
rational with $s - r < 1/n$. Then $c \le a^s < a^{r+1/n} \le a^b a^{1/n}$. Taking the limit
as $n \nearrow \infty$, we obtain $c \le a^b$.


2.3.8 In this solution, $r$, $s$, and $t$ denote rationals. Given $b$, let $(r_n)$ be a
sequence of rationals with $r_n \to b-$. If $t < bc$, then $t < r_n c$ for $n$ large. Pick
one such $r_n$ and call it $r$. Then $s = t/r < c$ and $t = rs$. Thus $t < bc$ iff $t$ is of
the form $rs$ with $r < b$ and $s < c$. By Exercise 2.3.6,

$$(a^b)^c = \sup\{(a^b)^s : 0 < s < c\} = \sup\{(\sup\{a^r : 0 < r < b\})^s : 0 < s < c\}
= \sup\{a^{rs} : 0 < r < b,\ 0 < s < c\} = \sup\{a^t : 0 < t < bc\} = a^{bc}.$$


2.3.9 Since $f(x + x') = f(x)f(x')$, by induction, we obtain $f(nx) = f(x)^n$.
Hence $f(n) = f(1)^n = a^n$ for $n$ natural. Also $a = f(1) = f(1 + 0) =
f(1)f(0) = af(0)$. Hence $f(0) = 1$. Since $1 = f(n - n) = f(n)f(-n) =
a^n f(-n)$, we obtain $f(-n) = a^{-n}$ for $n$ natural. Hence $f(n) = a^n$ for $n \in \mathbf{Z}$.
Now, $(f(m/n))^n = f(n(m/n)) = f(m) = a^m$, so by the uniqueness of roots,
$f(m/n) = a^{m/n}$. Hence $f(r) = a^r$ for $r \in \mathbf{Q}$. If $x$ is real, choose rationals
$r_n \to x$. Then $a^{r_n} = f(r_n) \to f(x)$. Since we know that $a^x$ is continuous,
$a^{r_n} \to a^x$. Hence $f(x) = a^x$.
2.3.10 Given $\epsilon > 0$, we seek $\delta > 0$, such that $|x - 1| < \delta$ implies $|(1/x) - 1| < \epsilon$.
By the triangle inequality, $|x| = |(x - 1) + 1| \ge 1 - |x - 1|$. So $\delta < 1$ and
$|x - 1| < \delta$ imply $|x| > 1 - \delta$ and

$$\left| \frac{1}{x} - 1 \right| = \frac{|x - 1|}{|x|} < \frac{\delta}{1 - \delta}.$$

Solving $\delta/(1 - \delta) = \epsilon$, we have found a $\delta$, $\delta = \epsilon/(1 + \epsilon)$, satisfying $0 < \delta < 1$
and the $\epsilon$-$\delta$ criterion.
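
A quick numerical check of this choice of $\delta$ is sketched below. The sampling grid and the names `delta_for` and `worst_error` are illustrative assumptions; a finite sample cannot prove the estimate, it only confirms that the formula behaves as derived.

```python
def delta_for(eps):
    """The delta found in the solution, i.e. the root of delta / (1 - delta) = eps."""
    return eps / (1 + eps)


def worst_error(delta, samples=10001):
    """Largest |1/x - 1| over sample points x = 1 + t with |t| < delta."""
    worst = 0.0
    for i in range(1, samples):
        t = -delta + 2 * delta * i / samples    # strictly inside (-delta, delta)
        worst = max(worst, abs(1 / (1 + t) - 1))
    return worst
```

For each tested $\epsilon$, `worst_error(delta_for(eps))` stays below $\epsilon$, with the largest values attained near $x = 1 - \delta$, as the estimate $\delta/(1 - \delta)$ predicts.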


2.3.11 Let $A_n$ be the set of all real roots of all polynomials with degree $d$
and rational coefficients $a_0, a_1, \dots, a_d$, with denominators bounded by $n$ and
satisfying

$$|a_0| + |a_1| + \dots + |a_d| + d \le n.$$

Since each polynomial has finitely many roots and there are finitely many
polynomials involved here, for each $n$, the set $A_n$ is finite. But the set of
algebraic numbers is $\bigcup_{n=1}^{\infty} A_n$; hence it is countable.


2.3.12 Let $b$ be a root of $f$, $b \ne a$, and let $f(x) = (x - b)g(x)$. If $b$ is
rational, then the coefficients of $g$ are necessarily rational (this follows from
the construction of $g$ in the text) and $0 = f(a) = (a - b)g(a)$. Hence $g(a) = 0$.
But the degree of $g$ is less than the degree of $f$. This contradiction shows that
$b$ is irrational.


2.3.13 Write $f(x) = (x - a)g(x)$. By the previous exercise, $f(m/n)$ is never
zero. Since $n^d f(m/n)$ is an integer,

$$n^d |f(m/n)| = n^d\, |m/n - a|\, |g(m/n)| \ge 1,$$

or

$$\left| a - \frac{m}{n} \right| \ge \frac{1}{n^d\, |g(m/n)|}. \tag{A.2.1}$$

Since $g$ is continuous at $a$, choose $\delta > 0$ such that $\mu_a(\delta) < 1$, i.e., such that
$|x - a| < \delta$ implies $|g(x) - g(a)| \le \mu_a(\delta) < 1$. Then $|x - a| < \delta$ implies $|g(x)| <
|g(a)| + 1$. Now, we have two cases: either $|a - m/n| \ge \delta$ or $|a - m/n| < \delta$.
In the first case, we obtain the required inequality with $c = \delta$. In the second
case, we obtain $|g(m/n)| < |g(a)| + 1$. Inserting this in the denominator of
the right side of (A.2.1) yields

$$\left| a - \frac{m}{n} \right| \ge \frac{1}{n^d\, |g(m/n)|} \ge \frac{1}{n^d (|g(a)| + 1)},$$

which is the required inequality with $c = 1/[|g(a)| + 1]$. Now, let

$$c = \min\left( \delta,\ \frac{1}{|g(a)| + 1} \right).$$

Then in either case, the required inequality holds with this choice of $c$.
2.3.14 Let $a$ be the displayed real. Given a natural $d$, we show that $a$ is not
algebraic of order $d$. This means, given any $\epsilon > 0$, we have to find $M/N$ such
that

$$\left| a - \frac{M}{N} \right| < \frac{\epsilon}{N^d}. \tag{A.2.2}$$

Thus the goal is as follows: given any $d \ge 1$ and $\epsilon > 0$, find $M/N$ satisfying
(A.2.2).

The simplest choice is to take $M/N$ equal to the $n$th partial sum $s_n$ in the
series for $a$. In this case, $N = 10^{n!}$. The question is, which $n$? To figure this
out, note that $k! \ge n!\,k$ when $k \ge n + 1$, and $10^{n!} \ge 2$, so

$$|a - s_n| = \sum_{k=n+1}^{\infty} \frac{1}{10^{k!}} \le \sum_{k=n+1}^{\infty} \frac{1}{(10^{n!})^k}
= \frac{1}{(10^{n!})^{n+1}} \sum_{k=0}^{\infty} \frac{1}{(10^{n!})^k} \le \frac{2}{(10^{n!})^{n+1}} = \frac{2}{N^{n+1}}.$$

Choosing $n \ge d$ with $2/10^{n!} < \epsilon$ gives $|a - s_n| < \epsilon/N^n \le \epsilon/N^d$;
hence $s_n = M/N$ satisfies (A.2.2).
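
The tail estimate can be checked with exact rational arithmetic. The sketch assumes the displayed real is $a = \sum_{k \ge 1} 10^{-k!}$ (the starting index is an assumption), and since $a$ itself is not computable exactly, it compares a later partial sum with $s_n$; `partial_sum` and `check` are hypothetical names.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n):
    """s_n = sum_{k=1}^{n} 10^(-k!) as an exact fraction (assumed form of the series)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))


def check(n, extra=2):
    """Return (s_{n+extra} - s_n, 2 / N^(n+1)) with N = 10^(n!)."""
    N = 10 ** factorial(n)
    gap = partial_sum(n + extra) - partial_sum(n)
    return gap, Fraction(2, N ** (n + 1))
```

Here `gap` is only a lower bound for $a - s_n$, so the check confirms that every later partial sum stays within $2/N^{n+1}$ of $s_n$; the bound on $a - s_n$ itself follows by taking the limit, as in the solution.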


2.3.15 From §1.6, we know that $\sum n^{-r}$ converges when $r = 1 + 1/N$. Given
$s > 1$ real, we can always choose $N$ with $1 + 1/N < s$. The result follows by
comparison.


2.3.16 To show that $b^{\log_a c} = c^{\log_a b}$, apply $\log_a$, obtaining $\log_a b \log_a c$ from
either side. For the second part, $\sum 1/5^{\log_3 n} = \sum 1/n^{\log_3 5}$, which converges
since $\log_3 5 > 1$.


2.3.17 Such an example cannot be continuous by the results of §2.3. Let
$f(x) = x + 1/2$, $0 \le x < 1/2$, and $f(x) = x - 1/2$, $1/2 \le x < 1$, $f(1) = 1$.
Then $f$ is a bijection, hence invertible.


2.3.18 First, assume f is increasing. Then by Exercise 2.2.3, f(c—) = f(c) = 
f(c+) for all except, at most, countably many points c € (a,b), where there 
are at worst jumps. Hence f is continuous for all but at most countably many 
points at which there are at worst jumps. If f is of bounded variation, then 
$f = g - h$ with $g$ and $h$ bounded increasing. But then $f$ is continuous wherever
both g and h are continuous. Thus the set of discontinuities of f is, at most, 
countable with the discontinuities at worst jumps. 


2.3.19 Let $M = \sup f((a,b))$. Then $M > -\infty$. By Theorem 1.5.4, there is
a sequence $(x_n)$ in $(a,b)$ satisfying $f(x_n) \to M$. Now, from Theorem 2.1.2,
$(x_n)$ subconverges to some $x$ in $(a,b)$ or to $a$ or to $b$. If $(x_n)$ subconverges to
$a$ or $b$, then $(f(x_n))$ subconverges to $f(a+)$ or $f(b-)$, respectively. But this
cannot happen since $M > f(a+)$ and $M > f(b-)$. Hence $(x_n)$ subconverges
to some $a < x < b$. Since $f$ is continuous, $(f(x_n))$ must subconverge to $f(x)$.
Hence $f(x) = M$. This shows that $M$ is finite and $M$ is a max.


2.3.20 Fix $y$ and set $h(x) = xy - f(x)$. Then by superlinearity

$$h(\infty) = \lim_{x \to \infty} (xy - f(x)) = \lim_{x \to \infty} x \left( y - \frac{f(x)}{x} \right) = -\infty;$$

similarly $h(-\infty) = -\infty$. Thus Exercise 2.3.19 applies and the sup is attained
in the definition of $g(y)$. Now, for $x > 0$ fixed and $y_n \to \infty$, $g(y_n) \ge xy_n -
f(x)$. Hence

$$\frac{g(y_n)}{y_n} \ge x - \frac{f(x)}{y_n}.$$

It follows that the lower limit of $(g(y_n)/y_n)$ is $\ge x$. Since $x$ is arbitrary, it
follows that the lower limit is $\infty$. Hence $g(y_n)/y_n \to \infty$. Since $(y_n)$ was any
sequence converging to $\infty$, we conclude that $\lim_{y \to \infty} g(y)/|y| = \infty$. Similarly,
$\lim_{y \to -\infty} g(y)/|y| = \infty$. Thus $g$ is superlinear.


2.3.21 Suppose that $y_n \to y$. We want to show that $g(y_n) \to g(y)$. Let
$L^* \ge L_*$ be the upper and lower limits of $(g(y_n))$. For all $z$, $g(y_n) \ge zy_n - f(z)$.
Hence $L_* \ge zy - f(z)$. Since $z$ is arbitrary, taking the sup over all $z$, we
obtain $L_* \ge g(y)$. For the reverse inequality, let $(y_n')$ be a subsequence of
$(y_n)$ satisfying $g(y_n') \to L^*$. Pick, for each $n \ge 1$, $x_n'$ with $g(y_n') = x_n'y_n' -
f(x_n')$. From §2.1, $(x_n')$ subconverges to some $x$, possibly infinite. If $x =
\pm\infty$, then superlinearity (see previous solution) implies the subconvergence
of $(g(y_n'))$ to $-\infty$. But $L^* \ge L_* \ge g(y) > -\infty$, so this cannot happen. Thus
$(x_n')$ subconverges to a finite $x$. Hence by continuity, $g(y_n') = x_n'y_n' - f(x_n')$
subconverges to $xy - f(x)$, which is $\le g(y)$. Since by construction, $g(y_n') \to L^*$,
this shows that $L^* \le g(y)$. Hence $g(y) \le L_* \le L^* \le g(y)$ or $g(y_n) \to g(y)$.


2.3.22 Note that $0 \le f(x) \le 1$ and $f(x) = 1$ iff $x \in \mathbf{Z}$. We are supposed
to take the limit in $m$ first, then $n$. If $x \in \mathbf{Q}$, then there is an $N \in \mathbf{N}$,
such that $n!\,x \in \mathbf{Z}$ for $n \ge N$. Hence $f(n!\,x) = 1$ for $n \ge N$. For such an
$x$, $\lim_{m \to \infty} [f(n!\,x)]^m = 1$, for every $n \ge N$. Hence the double limit is 1 for
$x \in \mathbf{Q}$. If $x \notin \mathbf{Q}$, then $n!\,x \notin \mathbf{Q}$, so $f(n!\,x) < 1$ and so $[f(n!\,x)]^m \to 0$, as
$m \nearrow \infty$, for every $n \ge 1$. Hence the double limit is 0 for $x \notin \mathbf{Q}$.


2.3.23 In the definition of $\mu_c(\delta)$, we are to maximize $|(1/x) - (1/c)|$ over all
$x \in (0,1)$ satisfying $|x - c| < \delta$ or $c - \delta < x < c + \delta$. In the first case, if
$\delta \ge c$, then $c - \delta \le 0$. Hence all points $x$ near and to the right of 0 satisfy
$|x - c| < \delta$. Since $\lim_{x \to 0+} (1/x) = \infty$, in this case, $\mu_c(\delta) = \infty$. In the second
case, if $0 < \delta < c$, then $x$ varies between $c - \delta > 0$ and $c + \delta$. Hence

$$\left| \frac{1}{x} - \frac{1}{c} \right| = \frac{|x - c|}{xc}$$

is largest when the numerator is largest ($|x - c| = \delta$) and the denominator is
smallest ($x = c - \delta$). Thus

$$\mu_c(\delta) = \begin{cases} \delta/(c^2 - c\delta), & 0 < \delta < c, \\ \infty, & \delta \ge c. \end{cases}$$

Now, $\mu_I(\delta)$ equals the sup of $\mu_c(\delta)$ for all $c \in (0,1)$. But, for $\delta$ fixed and
$c \to 0+$, $\delta \ge c$ eventually. Hence $\mu_I(\delta) = \infty$ for all $\delta > 0$. Hence $\mu_I(0+) = \infty$,
or $f$ is not uniformly continuous on $(0,1)$.
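
The closed form for $\mu_c(\delta)$ can be compared against a brute-force supremum over sample points. This is a sketch under the assumption, taken from the solution, that $f(x) = 1/x$ on $(0,1)$; `mu_formula` and `mu_sampled` are hypothetical names.

```python
def mu_formula(c, delta):
    """mu_c(delta) for f(x) = 1/x on (0, 1), as derived in the solution."""
    return delta / (c * c - c * delta) if delta < c else float("inf")


def mu_sampled(c, delta, samples=100000):
    """Approximate sup |1/x - 1/c| over x in (0, 1) with |x - c| < delta by sampling."""
    lo, hi = max(c - delta, 0.0), min(c + delta, 1.0)
    best = 0.0
    for i in range(1, samples):
        x = lo + (hi - lo) * i / samples    # interior points only
        best = max(best, abs(1 / x - 1 / c))
    return best
```

For $\delta < c$ the sampled value approaches the formula from below, and for fixed $\delta$ the formula blows up as $c$ decreases to $\delta$, which is the failure of uniform continuity in numerical form.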


2.3.24 Follow the proof of the uniform continuity theorem. If $\mu(0+) > 0$, set
$\epsilon = \mu(0+)/2$. Then since $\mu$ is increasing, $\mu(1/n) \ge 2\epsilon$ for all $n \ge 1$. Hence for
each $n \ge 1$, by the definition of the sup in the definition of $\mu(1/n)$, there is a
$c_n \in \mathbf{R}$ with $\mu_{c_n}(1/n) > \epsilon$. Now, by the definition of the sup in $\mu_{c_n}(1/n)$, for
each $n \ge 1$, there is an $x_n \in \mathbf{R}$ with $|x_n - c_n| < 1/n$ and $|f(x_n) - f(c_n)| > \epsilon$.
Then by compactness (§2.1), $(x_n)$ subconverges to some real $x$ or to $x = \pm\infty$.
It follows that $(c_n)$ subconverges to the same $x$. Hence $\epsilon < |f(x_n) - f(c_n)|$
subconverges to $|f(x) - f(x)| = 0$, a contradiction.


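2.3.25 If $\sqrt{2}^{\sqrt{2}}$ is rational, we are done. Otherwise $a = \sqrt{2}^{\sqrt{2}}$ is irrational.
In this case, let $b = \sqrt{2}$. Then $a^b = (\sqrt{2}^{\sqrt{2}})^{\sqrt{2}} = \sqrt{2}^{\,2} = 2$ is rational. In fact
$\sqrt{2}^{\sqrt{2}}$ is transcendental, a result due to Gelfond.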
Solutions to Exercises 3.1


3.1.1 Since $a > 0$, $f(0) = 0$. If $a = 1$, we already know that $f$ is not
differentiable at 0. If $a < 1$, then $g(x) = (f(x) - f(0))/(x - 0) = |x|^a/x$
satisfies $|g(x)| \to \infty$ as $x \to 0$, so $f$ is not differentiable at 0. If $a > 1$, then
$g(x) \to 0$ as $x \to 0$. Hence $f'(0) = 0$.


3.1.2 Since $\sqrt{2}$ is irrational, $f(\sqrt{2}) = 0$. Hence $(f(x) - f(\sqrt{2}))/(x - \sqrt{2}) =
f(x)/(x - \sqrt{2})$. Now, if $x$ is irrational, this expression vanishes, whereas if $x$
is rational with denominator $d$,

$$q(x) \equiv \frac{f(x) - f(\sqrt{2})}{x - \sqrt{2}} = \frac{1}{d^3 (x - \sqrt{2})}.$$

By Exercise 1.4.9 it seems that the limit will be zero, as $x \to \sqrt{2}$. To prove
this, suppose that the limit of $q(x)$ is not zero, as $x \to \sqrt{2}$. Then there exists
at least one sequence $x_n \to \sqrt{2}$ with $q(x_n) \not\to 0$. It follows that there is a
$\delta > 0$ and a sequence $x_n \to \sqrt{2}$ with $|q(x_n)| \ge \delta$. But this implies all the
reals $x_n$ are rational. If $d_n$ is the denominator of $x_n$, $n \ge 1$, we obtain, from
Exercise 1.4.9,


Since $d_n \to \infty$ (Exercise 2.2.1), we conclude that $q(x_n) \to 0$, contradicting
our assumption. Hence our assumption must be false, i.e., $\lim_{x \to \sqrt{2}} q(x) = 0$
or $f'(\sqrt{2}) = 0$.


3.1.3 Since $f$ is superlinear (Exercise 2.3.20), $g(y)$ is finite, and the max is
attained at some critical point $x$. Differentiating $xy - ax^2/2$ with respect to
$x$ yields $0 = y - ax$, or $x = y/a$ for the critical point, which, as previously
said, must be the global max. Hence

$$g(y) = (y/a)y - a(y/a)^2/2 = y^2/2a.$$

Since $f'(x) = ax$ and $g'(y) = y/a$, it is clear they are inverses.
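
The formula $g(y) = y^2/2a$ can be compared with a brute-force maximization of $xy - f(x)$. The sketch below uses a uniform grid on a fixed interval, which is an illustrative choice and only valid when the maximizer $y/a$ lies inside that interval; `legendre` is a hypothetical name.

```python
def legendre(f, y, lo=-50.0, hi=50.0, steps=200001):
    """Approximate g(y) = max over x of (x*y - f(x)) on a uniform grid over [lo, hi]."""
    best = float("-inf")
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        best = max(best, x * y - f(x))
    return best


a = 3.0


def f(x):
    """The quadratic f(x) = a x^2 / 2 from the solution, with a = 3 as a sample value."""
    return a * x * x / 2
```

With $a = 3$ and $y = 2$, the grid maximum agrees with $y^2/2a = 2/3$ to within the grid resolution, and the maximizing grid point sits next to $x = y/a$.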


3.1.4 Suppose that $g'(\mathbf{R})$ is bounded above. Then $g'(x) \le c$ for all $x$. Hence
$g(x) - g(0) = g'(z)(x - 0) \le cx$ for $x > 0$, which implies $g(x)/x \le c + g(0)/x$ for
$x > 0$, which contradicts superlinearity. Hence $g'(\mathbf{R})$ is not bounded above.
Similarly, $g'(\mathbf{R})$ is not bounded below.


3.1.5 To show that $f'(c)$ exists, let $x_n \to c$ with $x_n \ne c$ for all $n \ge 1$.
Then for each $n \ge 1$, there is a $y_n$ strictly between $c$ and $x_n$ such that
$f(x_n) - f(c) = f'(y_n)(x_n - c)$. Since $x_n \to c$, $y_n \to c$, and $y_n \ne c$ for all
$n \ge 1$. Since $\lim_{x \to c} f'(x) = L$, it follows that $f'(y_n) \to L$. Hence $(f(x_n) -
f(c))/(x_n - c) \to L$. Since $(x_n)$ was arbitrary, we conclude that $f'(c) = L$.


3.1.6 To show that $f'(c) = f'(c+)$, let $x_n \to c+$. Then for each $n \ge 1$,
there is a $y_n$ between $c$ and $x_n$, such that $f(x_n) - f(c) = f'(y_n)(x_n - c)$.
Since $x_n \to c+$, $y_n \to c+$. It follows that $f'(y_n) \to f'(c+)$. Hence $(f(x_n) -
f(c))/(x_n - c) \to f'(c+)$, i.e., $f'(c) = f'(c+)$, similarly for $f'(c-)$.


3.1.7 If $a = x_0 < x_1 < x_2 < \dots < x_n = b$ is a partition, the mean value
theorem says $f(x_k) - f(x_{k-1}) = f'(z_k)(x_k - x_{k-1})$ for some $z_k$ between $x_{k-1}$
and $x_k$, $1 \le k \le n$. Since $|f'(x)| \le I$, we obtain $|f(x_k) - f(x_{k-1})| \le I(x_k -
x_{k-1})$. Summing over $1 \le k \le n$, we see that the variation corresponding
to this partition is $\le I(b - a)$. Since the partition was arbitrary, the result
follows.

3.1.8 Let $c \in \mathbf{Q}$. We have to show that, for some $n \ge 1$, $f(c) \ge f(x)$ for all
$x$ in $(c - 1/n, c + 1/n)$. If this were not the case, for each $n \ge 1$, we can find
a real $x_n$ satisfying $|x_n - c| < 1/n$ and $f(x_n) > f(c)$. But, by Exercise 2.2.1,
we know that $f(x_n) \to 0$ since $x_n \to c$ and $x_n \ne c$, contradicting $f(c) > 0$.
Hence $c$ must be a local maximum.


3.1.9 If $f$ is even, then $f(-x) = f(x)$. Differentiating yields $-f'(-x) = f'(x)$,
or $f'$ is odd, similarly if $f$ is odd.
3.1.10 Let $g(x) = (f(x) - f(r))/(x - r)$, $x \ne r$, and $g(r) = f'(r)$. Then $g$ is
continuous and $f(r) = 0$ iff $f(x) = (x - r)g(x)$.


3.1.11 As in the previous exercise, set

$$g(x) = \frac{f(x)}{\prod_{j=1}^{d} (x - r_j)}.$$

Then $g$ is continuous away from $r_1, \dots, r_d$. If $f(r_i) = 0$, then

$$\lim_{x \to r_i} g(x) = \frac{f'(r_i)}{\prod_{j \ne i} (r_i - r_j)}.$$

Since $g$ has removable singularities at $r_i$, $g$ can be extended to be continuous
there. With this extension, we have $f(x) = (x - r_1) \dots (x - r_d)g(x)$.
Conversely, if $f(x) = (x - r_1) \dots (x - r_d)g(x)$, then $f(r_i) = 0$.

3.1.12 If $a < b$ and $f(a) = f(b) = 0$, then by the mean value theorem, there
is a $c \in (a,b)$ satisfying $f'(c) = 0$. Thus between any two roots of $f$, there is
a root of $f'$.


Solutions to Exercises 3.2 


3.2.1 Let $f(x) = e^x$. Since $f(x) - f(0) = f'(c)x$ for some $0 < c < x$, we have
$e^x - 1 = e^c x$. Since $c > 0$, $e^c > 1$. Hence $e^x - 1 \ge x$ for $x \ge 0$.


3.2.2 Let $f(x) = 1 + x^\alpha$, $g(x) = (1 + x)^\alpha$. Then $f'(c)/g'(c) = (c/(1 +
c))^{\alpha - 1} \ge 1$; hence $(f(x) - f(0))/(g(x) - g(0)) \ge 1$. For the second part,
$(a + b)^\alpha = a^\alpha (1 + (b/a))^\alpha \le a^\alpha (1 + (b/a)^\alpha) = a^\alpha + b^\alpha$.

3.2.3 If $f(x) = \log(1 + x)$, then by l'Hôpital's rule $\lim_{x \to 0} \log(1 + x)/x =
\lim_{x \to 0} 1/(1 + x) = 1$. This proves the first limit. If $a \ne 0$, set $x_n = a/n$. Then
$x_n \to 0$ with $x_n \ne 0$ for all $n \ge 1$. Hence

$$\lim_{n \nearrow \infty} n \log(1 + a/n) = a \lim_{n \nearrow \infty} \frac{\log(1 + x_n)}{x_n} = a.$$

By taking exponentials, we obtain $\lim_{n \nearrow \infty} (1 + a/n)^n = e^a$ when $a \ne 0$. If
$a = 0$, this is immediate, so the second limit is true for all $a$. Now, if $a_n \to a$,
then for some $N \ge 1$, $a - \epsilon \le a_n \le a + \epsilon$ for all $n \ge N$. Hence

$$\left( 1 + \frac{a - \epsilon}{n} \right)^n \le \left( 1 + \frac{a_n}{n} \right)^n \le \left( 1 + \frac{a + \epsilon}{n} \right)^n$$

for $n \ge N$. Thus the upper and lower limits of the sequence in the middle lie
between $e^{a - \epsilon}$ and $e^{a + \epsilon}$. Since $\epsilon > 0$ is arbitrary, the upper and lower limits
must be both equal $e^a$. Hence we obtain the third limit.


3.2.4 Let $b = x/(n + 1)$, $v = (n + 1)/n$, and $e_n = (1 + x/n)^n$. Then $|b| < 1$,
so by (3.2.2), $(1 + b)^{nv} \ge (1 + vb)^n$. Hence $e_{n+1} \ge e_n$.

3.2.5 If $f(x) = (1 + x^2)^{-1/2}$, then $f(x) - f(0) = f'(c)(x - 0)$ with $0 < c < x$.
Since $f'(c) = -c(1 + c^2)^{-3/2} \ge -c \ge -x$ for $x > 0$,

$$\frac{1}{\sqrt{1 + x^2}} - 1 = f'(c)x \ge -x^2,$$

which implies the result.


3.2.6 With $f(t) = -t^{-x}$ and $f'(t) = xt^{-x-1}$, the left side of the displayed
inequality equals $f(2j) - f(2j - 1)$, which equals $f'(c)$ with $2j - 1 < c < 2j$.
Hence the left side is $xc^{-x-1} \le x(2j - 1)^{-x-1}$.


3.2.7 Let $g(x) = f(x)^2$. Then $g'(x) \le 2g(x)$, so $e^{-2x}g(x)$ has nonpositive
derivative and $e^{-2x}g(x) \le g(0) = 0$. Thus $f$ is identically zero.


3.2.8 First, suppose that $L = 0$. If there is no such $x$, then $f'$ is never zero.
Hence $f' > 0$ on $(a,b)$ or $f' < 0$ on $(a,b)$, contradicting $f'(c) < 0 < f'(d)$.
Hence there is an $x$ satisfying $f'(x) = 0$. In general, let $g(x) = f(x) - Lx$.
Then $f'(c) < L < f'(d)$ implies $g'(c) < 0 < g'(d)$, so the general case follows
from the case $L = 0$.


3.2.9 From Exercise 3.1.4, $g'(\mathbf{R})$ is not bounded above nor below. But $g'$
satisfies the intermediate value property. Hence the range of $g'$ is an interval.
Hence $g'(\mathbf{R}) = \mathbf{R}$.


3.2.10 Note first fa(1) = 1 and 


d—1\? _ io d—1\? 
f(t) = (* ) ida cee (= ) , #21 


By the mean value theorem, 


$$f_d(t) - 1 = f_d(t) - f_d(1) \le \left( \frac{d - 1}{d} \right)^2 (t - 1), \qquad t \ge 1.$$


Solutions to Exercises 3.3 


3.3.1 Since $f'(x) = (1/2) - (1/x^2)$, the only positive critical point (Figure A.1)
is $x = \sqrt{2}$. Moreover, $f(\infty) = f(0+) = \infty$, so $\sqrt{2}$ is a global minimum over
$(0, \infty)$, and $f(\sqrt{2}) = \sqrt{2}$. Also $f''(x) = 2/x^3 > 0$, so $f$ is convex.
3.3.2 If $a < x < y < b$, then

$$f((1 - t)x + ty) = (1 - t)f(x) + tf(y), \qquad 0 < t < 1. \tag{A.3.1}$$


Fig. A.1 The graph of (x + 2/x)/2 


Differentiate with respect to t to get 


Ply) = P(e) 


Y-a@ 


f(A —t)x + ty) = Ot <1, (A.3.2) 


Thus f’ is constant on (x,y) and hence on (a,b). Conversely, suppose f’ is a 
constant m. Then by the MVT, f(y) — f(x) = m(y—<), hence (A.3.2) holds, 
and thus d 


£ (1—te +t) =F) fla), O<t<1, 
which implies 

f(A — the + ty) = f(a) +tfy)-f@)), O<t<1. 
But this is (A.3.1). 
3.3.3 If $a < x < y < b$ and $0 \le t \le 1$, then $f((1-t)x + ty) \le (1-t)f(x) + tf(y)$, so

$$g(f((1-t)x + ty)) \le g((1-t)f(x) + tf(y)) \le (1-t)g(f(x)) + tg(f(y)).$$


3.3.4 If $a < b < c$ and $t = (b-a)/(c-a)$, then $b = (1-t)a + tc$. Hence by
convexity,

$$f(b) \le (1-t)f(a) + tf(c).$$

Subtracting $f(a)$ from both sides, and then dividing by $b - a$, yields $s[a,b] \le s[a,c]$.
Instead, if we subtract both sides from $f(c)$ and then divide by $c - b$,
we obtain $s[a,c] \le s[b,c]$.


3.3.5 Exercise 3.3.4 says $x \mapsto s[c,x]$ is an increasing function of $x$. Hence

$$f'_+(c) = \lim_{t\to c+} s[c,t] = \inf\{s[c,t] : t > c\} \le s[c,x], \qquad x > c,$$

exists. Similarly

$$f'_-(d) = \lim_{t\to d-} s[t,d] = \sup\{s[t,d] : t < d\} \ge s[x,d], \qquad x < d,$$

exists. Since $s[c,x] \le s[x,d]$ by Exercise 3.3.4, (3.3.12) follows. Also since
$s[y,c] \le s[c,x]$ for $y < c < x$, inserting $c = d$ in the last two inequalities,
we conclude that $f'_-(c) \le f'_+(c)$. Moreover, since $t < x < s < y$ implies
$s[t,x] \le s[s,y]$, let $t \to x-$ and $s \to y-$ to get $f'_-(x) \le f'_-(y)$; hence $f'_-$ is
increasing, and similarly for $f'_+$.


3.3.6 The inequality (3.3.12) implies $f'_+(c) \le s[c,x] \le f'_-(d)$. Multiplying
this inequality by $(x - c)$ and letting $x \to c+$ yield $f(c+) = f(c)$. Similarly
multiplying $f'_+(c) \le s[x,d] \le f'_-(d)$ by $(x - d)$ and letting $x \to d-$ yield
$f(d-) = f(d)$. Since $c$ and $d$ are any reals in $(a,b)$, we conclude $f$ is continuous
on $(a,b)$.


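3.3.7 Multiply $f'_+(c) \le s[c,x]$ by $(x - c)$ for $x > c$ and rearrange to get

$$f(x) \ge f(c) + f'_+(c)(x - c), \qquad x \ge c.$$

Since $f'_+(c) \ge f'_-(c)$, this implies

$$f(x) \ge f(c) + f'_-(c)(x - c), \qquad x \ge c.$$

Similarly, multiply $f'_-(c) \ge s[y,c]$ by $(y - c)$ for $y < c$ and rearrange to get

$$f(y) \ge f(c) + f'_-(c)(y - c), \qquad y \le c.$$

Since $f'_+(c) \ge f'_-(c)$ and $y - c \le 0$, this implies

$$f(y) \ge f(c) + f'_+(c)(y - c), \qquad y \le c.$$

Thus

$$f(x) \ge f(c) + f'_\pm(c)(x - c), \qquad a < x < b.$$

If $f$ is differentiable at $c$, the second inequality follows since $f'_+(c) = f'(c) = f'_-(c)$.
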
3.3.8 If $p$ is a subdifferential of $f$ at $c$, rearranging the inequality

$$f(x) \ge f(c) + p(x - c), \qquad a < x < b,$$

yields

$$\frac{f(x) - f(c)}{x - c} \ge p \ge \frac{f(y) - f(c)}{y - c}$$

for $y < c < x$. Letting $x \to c+$ and $y \to c-$, we conclude

$$f'_-(c) \le p \le f'_+(c).$$

Conversely, assume $f$ is convex; then $f'_\pm(c)$ exist and are subdifferentials of
$f$ at $c$. If $x \ge c$ and $f'_-(c) \le p \le f'_+(c)$, we have

$$f(x) \ge f(c) + f'_+(c)(x - c) \ge f(c) + p(x - c), \qquad c \le x < b.$$

Similarly, if $x \le c$, we have

$$f(x) \ge f(c) + f'_-(c)(x - c) \ge f(c) + p(x - c), \qquad a < x \le c.$$

Hence $p$ is a subdifferential of $f$ at $c$.


3.3.9 If $c$ is a maximum of $f$, then $f(c) \ge f(x)$ for $a < x < b$. Let $p$ be a
subdifferential of $f$ at $c$: $f(x) \ge f(c) + p(x - c)$ for $a < x < b$. Combining
these inequalities yields $f(x) \ge f(x) + p(x - c)$ or $0 \ge p(x - c)$ for $a < x < b$.
Hence $p = 0$; hence $f(x) \ge f(c)$ from the subdifferential inequality. Thus
$f(x) = f(c)$ for $a < x < b$.


3.3.10 We are given that $f(c) - g(c) \ge f(x) - g(x)$ for all $a < x < b$. Let $p$
be a subdifferential of $f$ at $c$: $f(x) \ge f(c) + p(x - c)$, $a < x < b$. Combining
these inequalities yields $g(x) \ge g(c) + p(x - c)$, that is, $p$ is a subdifferential of $g$
at $c$. Hence $p = g'(c)$ by Exercise 3.3.8. Hence $f$ has a unique subdifferential
at $c$. Hence by Exercise 3.3.8 again, $f$ is differentiable at $c$ and $f'(c) = g'(c)$.


3.3.11 Since $f_j$, $j = 1,\dots,n$, is convex, we have

$$f_j((1-t)x + ty) \le (1-t)f_j(x) + tf_j(y) \le (1-t)f(x) + tf(y), \qquad 0 < t < 1,$$

for each $j = 1,\dots,n$. Maximizing the left side over $j = 1,\dots,n$, the result
follows.


3.3.12 The sup in (3.3.10) is attained by Exercise 2.3.20; since by Exercise 2.3.21,
$g$ is continuous, the same reasoning applies to (3.3.11); hence the
sup is attained there as well. Fix $a < b$ and $0 \le t \le 1$. Then

$$x[(1-t)a + tb] - f(x) = (1-t)[xa - f(x)] + t[xb - f(x)] \le (1-t)g(a) + tg(b).$$

Since this is true for every $x$, taking the sup of the left side over $x$ yields the
convexity of $g$. Now, fix $y$, and suppose that $g(y) = xy - f(x)$, i.e., $x$ attains
the max in the definition (3.3.10) of $g(y)$. Since $g(z) \ge xz - f(x)$ for all $z$,
we get

$$g(z) \ge xz - f(x) = xz - (xy - g(y)) = g(y) + x(z - y)$$

for all $z$. This shows $x$ is a subdifferential of $g$ at $y$.


3.3.13 Since $x \mapsto -x$ is a bijection and $f$ is even,

$$g(-y) = \sup_{-\infty<x<\infty} (x(-y) - f(x)) = \sup_{-\infty<x<\infty} ((-x)(-y) - f(-x)) = \sup_{-\infty<x<\infty} (xy - f(x)) = g(y).$$

Thus $g$ is even. If $y \ge 0$ and $x \ge 0$, then $xy - f(x) \ge (-x)y - f(-x)$, so when
$y \ge 0$, the sup can be restricted over $x \ge 0$.


3.3.14 Since $p > 1$, $f$ is superlinear, so the max exists. By Exercise 3.3.13,
we need to consider only $y \ge 0$ and consequently only $x \ge 0$ when maximizing
$xy - f(x)$. If $y = 0$, we obtain $g(0) = 0$, whereas if $y > 0$, we need to consider
only $x > 0$ since $f(x) \ge 0$. To find the critical points, solve $0 = (xy - f(x))' =
y - f'(x)$ for $x > 0$ obtaining $x^{p-1} = y$ or $x = y^{1/(p-1)}$. Plugging this into
$xy - f(x)$ yields the required $g$, using $(p-1)(q-1) = 1$ and $p(q-1) = q$.
Finally, $f'$ and $g'$ are odd, and, for $x > 0$, $f'(x) = x^{p-1}$ and $g'(y) = y^{q-1}$ are
inverses since $(p-1)(q-1) = 1$.
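
As a numerical sanity check of 3.3.14, the sketch below approximates the Legendre transform by brute force on a grid and compares it with $|y|^q/q$. It assumes $f(x) = |x|^p/p$ as in the exercise; the helper name `legendre_on_grid`, the choice $p = 3$, and the grid bounds are illustrative choices, not part of the text.

```python
def legendre_on_grid(f, y, lo=-10.0, hi=10.0, steps=20_001):
    """Approximate g(y) = max_x (x*y - f(x)) by scanning a uniform grid."""
    step = (hi - lo) / (steps - 1)
    return max(x * y - f(x) for x in (lo + i * step for i in range(steps)))

p = 3.0
q = p / (p - 1)  # conjugate exponent, so that (p - 1)(q - 1) = 1

def f(x):
    return abs(x) ** p / p

def g_closed(y):
    # Closed form claimed in the solution: g(y) = |y|^q / q
    return abs(y) ** q / q

# Largest discrepancy between the grid maximum and the closed form
max_error = max(abs(legendre_on_grid(f, y) - g_closed(y))
                for y in (-2.0, -0.5, 0.0, 1.0, 3.0))
print(max_error)
```

The grid only approximates the maximum, so agreement is expected up to the grid resolution, not exactly.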


3.3.15 Again, it is enough to restrict to $y \ge 0$ and $x \ge 0$. Also $(xy - f(x))' = 0$
iff $y = f'(x)$, i.e., $y = e^x$. Thus $x > 0$ is a critical point if $y > 1$
and $x = \log y$, which gives $xy - f(x) = y\log y - y + 1$. If $0 \le y \le 1$, the
function $x \mapsto xy - f(x)$ has no critical points in $(0,\infty)$. Hence it is maximized
at $x = 0$, i.e., $g(y) = 0$ when $0 \le y \le 1$. If $y > 1$, we obtain the critical point
$x = \log y$, the corresponding critical value $y\log y - y + 1$, and the endpoint
values $0$ and $-\infty$. To see which of these three values is largest, note that
$(y\log y - y + 1)' = \log y > 0$ for $y > 1$, and thus $y\log y - y + 1 > 0$ for $y > 1$.
Hence $g(y) = y\log y - y + 1$ for $y \ge 1$ and $g(y) = 0$ for $0 \le y \le 1$.


3.3.16 Since $f$ is convex, it is continuous (Exercise 3.3.6). Thus by Exercise 2.3.20,
$g$ is well defined and superlinear. By Exercise 3.3.12, $g$ is
convex. It remains to derive the formula for $f(x)$. To see this, note, by the
formula for $g(y)$, that $f(x) + g(y) \ge xy$ for all $x$ and all $y$, which implies
$f(x) \ge \max_y[xy - g(y)]$. To obtain equality, we need to show the following:
For each $x$, there is a $y$ satisfying $f(x) + g(y) = xy$. To this end, fix $x$; by
Exercise 3.3.8, $f$ has a subdifferential $p$ at $x$. Hence $f(t) \ge f(x) + p(t - x)$ for
all $t$, which yields $xp \ge f(x) + (pt - f(t))$. Taking the sup over all $t$, we obtain
$xp \ge f(x) + g(p)$. Since we already know $f(x) + g(p) \ge xp$ by the definition
(3.3.10) of $g$, we conclude $f(x) + g(p) = xp$. Hence $f(x) = \max_y(xy - g(y))$,
i.e., (3.3.11) holds. Note that when $f$ is the Legendre transform of $g$, then $f$
is necessarily convex; hence if $f$ is not convex, the result cannot possibly be
true.


3.3.17 The only if part was carried out in Exercise 3.3.12. Now fix $y$ and
suppose $x$ is a subdifferential of $g$ at $y$. Then $g(z) \ge g(y) + x(z - y)$ for all $z$.
This implies $xy \ge g(y) + (xz - g(z))$ for all $z$. Maximizing over $z$ and appealing
to Exercise 3.3.16, we obtain $xy \ge g(y) + f(x)$. Since we already know by
the definition (3.3.10) of $g$ that $xy \le g(y) + f(x)$, we conclude $x$ achieves the
maximum in the definition of $g$. For a counterexample for non-convex $f$, let
$f(x) = 1 - x^2$ for $|x| \le 1$, $f(x) = x^2 - 1$ for $|x| \ge 1$. Then $f$ is superlinear and
continuous, so its Legendre transform $g$ is well defined and convex. In fact
$g(y) = |y|$, $|y| \le 2$, $g(y) = 1 + y^2/4$, $|y| \ge 2$. The set of subdifferentials of $g$
at $y = 0$ is $[-1,1]$, while $x$ attains the max in (3.3.10) for $g(0)$ iff $x = \pm1$.
It may help to graph $f$ and $g$.
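
In place of a graph, the counterexample in 3.3.17 can be explored numerically. The sketch below is a brute-force approximation under assumed grid bounds; `legendre_on_grid` is a hypothetical helper, not notation from the text. It compares the grid maximum with the stated formula for $g$ and lists the grid points attaining the maximum for $g(0)$.

```python
def f(x):
    # 1 - x^2 on |x| <= 1 and x^2 - 1 on |x| >= 1, i.e. |x^2 - 1|
    return abs(x * x - 1.0)

def g_closed(y):
    # Formula stated in the solution
    return abs(y) if abs(y) <= 2 else 1 + y * y / 4

LO, HI, STEPS = -5.0, 5.0, 10_001
STEP = (HI - LO) / (STEPS - 1)
GRID = [LO + i * STEP for i in range(STEPS)]

def legendre_on_grid(y):
    """Approximate g(y) = max_x (x*y - f(x)) over the fixed grid."""
    return max(x * y - f(x) for x in GRID)

max_error = max(abs(legendre_on_grid(y) - g_closed(y))
                for y in (-3.0, -1.0, 0.0, 0.5, 2.0, 4.0))

# Grid points at which the max defining g(0) is (numerically) attained
g0 = legendre_on_grid(0.0)
maximizers = sorted({round(x, 6) for x in GRID if abs(-f(x) - g0) < 1e-6})
print(max_error, maximizers)
```

Only $x = \pm1$ appear as maximizers, even though every point of $[-1,1]$ is a subdifferential of $g$ at $0$.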


3.3.18 Since f is convex, f’ is increasing. By Exercise 2.2.3, this implies 
f’ can only have jump discontinuities. By Exercise 3.2.8, f’ satisfies the 
intermediate value property; hence it cannot have jump discontinuities. Hence 
f’ is continuous. 


3.3.19 Fix $y$ and suppose the maximum in the definition (3.3.10) of $g(y)$ is
attained at $x_1$ and $x_2$ with $x_1 \ne x_2$. By strict convexity of $f$, if $x = (x_1 + x_2)/2$, we have

$$g(y) = \frac12 g(y) + \frac12 g(y) = \frac12(x_1y - f(x_1)) + \frac12(x_2y - f(x_2)) < xy - f(x),$$

contradicting the definition of $g(y)$. Thus there can only be one real $x$ at
which the sup is attained; hence by Exercise 3.3.17, there is a unique subdifferential
of $g$ at $y$. By Exercise 3.3.8, this shows $g'_+(y) = g'_-(y)$; hence $g$
is differentiable at $y$. Since $g$ is convex by Exercise 3.3.12, we conclude $g'$ is
continuous by Exercise 3.3.18.


3.3.20 We already know $g$ is superlinear, differentiable, and convex, and
$xy = f(x) + g(y)$ iff $x$ attains the maximum in the definition (3.3.10) of
$g(y)$ iff $x$ is a subdifferential of $g$ at $y$ iff $x = g'(y)$. Similarly, since $f$ is
the Legendre transform of $g$, we know $xy = f(x) + g(y)$ iff $y$ attains the
maximum in (3.3.11) iff $y$ is a subdifferential of $f$ at $x$ iff $y = f'(x)$. Thus $f'$
is the inverse of $g'$. By the inverse function theorem for continuous functions
§2.3, it follows that $g'$ is strictly increasing; hence $g$ is strictly convex.


3.3.21 Here f’ does not exist at 0. However, the previous exercise suggests 
that g’ is trying to be the inverse of f’ which suggests that f’(0) should be 
defined to be (Figure A.2) the line segment [—1, 1] on the vertical axis. Of 
course, with such a definition, f’ is no longer a function, but something more 
general (f’(0) is the set of subdifferentials at 0; see Exercise 3.3.8). 


3.3.22 Since $(e^x)'' > 0$, $e^x$ is convex. Hence $e^{t\log a + (1-t)\log b} \le te^{\log a} + (1-t)e^{\log b}$.
But this simplifies to $a^tb^{1-t} \le ta + (1-t)b$.


Fig. A.2 The graphs of $f$, $g$, $f'$, $g'$ (Exercise 3.3.21)


3.3.23 By Exercise 3.3.20, we know $g$ is superlinear, differentiable, and
strictly convex, with $g'(f'(x)) = x$ for all $x$. If $g'$ is differentiable, differentiating
yields $g''[f'(x)]f''(x) = 1$, so $f''(x)$ never vanishes. Since convexity
implies $f''(x) \ge 0$, we obtain $f''(x) > 0$ for all $x$. Conversely, if $f''(x) > 0$, by
the inverse function theorem for derivatives §3.2, $g$ is twice differentiable with
$g''(x) = (g')'(x) = 1/(f')'[g'(x)] = 1/f''[g'(x)]$. Hence $g$ is twice differentiable
and

$$g''(x) = \frac{1}{f''[g'(x)]}.$$

Since $f$ is smooth, whenever $g$ is $n$ times differentiable, $g'$ is $n - 1$ times
differentiable; hence by the right side of this last equation, $g''$ is $n - 1$ times
differentiable; hence $g$ is $n + 1$ times differentiable. By induction, it follows that
$g$ is smooth. For the counterexample, let $f(x) = x^4/4$. Although $f''(0) = 0$,
since $f''(x) > 0$ for $x \ne 0$, it follows that $f$ is strictly convex on $(-\infty,0)$
and on $(0,\infty)$. From this it is easy to conclude (draw a picture) that $f$ is
strictly convex on $\mathbf{R}$. Also $f$ is superlinear and smooth, but $g(y) = (3/4)|y|^{4/3}$
(Exercise 3.3.14) is not smooth at 0.


3.3.24 Since $(f')^{(j)}(r_i) = f^{(j+1)}(r_i) = 0$ for $0 \le j \le n_i - 2$, it follows that $r_i$
is a root of $f'$ of order $n_i - 1$. Also by Exercise 3.1.12, there are $k - 1$ other
roots $s_1,\dots,s_{k-1}$. Since

$$(n_1 - 1) + (n_2 - 1) + \dots + (n_k - 1) + (k - 1) = n - 1,$$

the result follows. Note if these roots of $f$ are in $(a,b)$, then so are these roots
of $f'$.


3.3.25 If $f(x) = (x - r_1)^{n_1}g(x)$, differentiating $j$ times, $0 \le j \le n_1 - 1$,
shows $r_1$ is a root of $f$ of order $n_1$. Since the advertised $f$ has the form
$f(x) = (x - r_i)^{n_i}g_i(x)$ for each $1 \le i \le k$, each $r_i$ is a root of order $n_i$;
hence $f$ has $n$ roots. Conversely, we have to show that a degree $n$ polynomial
having $n$ roots must be of the advertised form. This we do by induction. If
$n = 1$, then $f(x) = ax + b$ and $f(r) = 0$ imply $ar + b = 0$; hence $b = -ar$;
hence $f(x) = a(x - r)$. Assume the result is true for $n - 1$, and let $f$ be a
degree $n$ polynomial having $n$ roots. If $r_1$ is a root of $f$ of order $n_1$, define
$g(x) = f(x)/(x - r_1)$. Differentiating $f(x) = (x - r_1)g(x)$ $j + 1$ times yields

$$f^{(j+1)}(x) = (j+1)g^{(j)}(x) + (x - r_1)g^{(j+1)}(x).$$

Inserting $x = r_1$ shows $g^{(j)}(r_1) = 0$ for $0 \le j \le n_1 - 2$. Thus $r_1$ is a root
of $g$ of order $n_1 - 1$. If $r_i$ is any other root of $f$ of order $n_i$, differentiating
$g(x) = f(x)/(x - r_1)$ using the quotient rule $n_i - 1$ times and inserting $x = r_i$
show $g^{(j)}(r_i) = 0$ for $0 \le j \le n_i - 1$. Thus $r_i$ is a root of $g$ of order $n_i$. We
conclude $g$ has $n - 1$ roots. Since $g$ is a degree $n - 1$ polynomial, by induction,
the result follows.


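3.3.26 If $f$ has $n$ negative roots, then by Exercise 3.3.25

$$f(x) = C(x - r_1)^{n_1}(x - r_2)^{n_2}\cdots(x - r_k)^{n_k}$$

for some distinct negative reals $r_1,\dots,r_k$ and naturals $n_1,\dots,n_k$ satisfying
$n_1 + \dots + n_k = n$. Hence $g(x) = x^nf(1/x)$ satisfies

$$g(x) = C(1 - r_1x)^{n_1}\cdots(1 - r_kx)^{n_k} = C'\left(x - \frac{1}{r_1}\right)^{n_1}\cdots\left(x - \frac{1}{r_k}\right)^{n_k}, \qquad C' = C(-r_1)^{n_1}\cdots(-r_k)^{n_k},$$

which shows $g$ has $n$ negative roots.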


3.3.27 Since the $a_j$'s are positive, $f$ has $n$ negative roots by Exercise 3.3.25,
establishing A. B follows from Exercise 3.3.24, and C follows from Exercise 3.3.26.
Since the $j$th derivative of $x^k$ is nonzero iff $j \le k$, the only terms
in $f$ that do not vanish upon $n - k - 1$ differentiations are

$$x^n + \binom{n}{1}p_1x^{n-1} + \dots + \binom{n}{k+1}p_{k+1}x^{n-k-1}.$$

This implies

$$f^{(n-k-1)}(x) = \frac{n!}{(k+1)!}\left(x^{k+1} + \binom{k+1}{1}p_1x^k + \dots + \binom{k+1}{k}p_kx + p_{k+1}\right),$$

which implies

$$x^{k+1}f^{(n-k-1)}(1/x) = \frac{n!}{(k+1)!}\left(1 + \binom{k+1}{1}p_1x + \dots + \binom{k+1}{k}p_kx^k + p_{k+1}x^{k+1}\right).$$

Differentiating these terms $k - 1$ times yields $p$. By Exercise 3.3.24, $p$ has
two roots. This establishes D. Since a quadratic with roots has nonnegative
discriminant (Exercise 1.4.5), the result follows.

3.3.28 Since $p_k^2 \ge p_{k-1}p_{k+1}$, $1 \le k \le n - 1$, we have $p_1^2 \ge p_2$ or $p_1 \ge p_2^{1/2}$.
Assume $p_{k-1}^{1/(k-1)} \ge p_k^{1/k}$. Then

$$p_k^2 \ge p_{k-1}p_{k+1} \ge p_k^{(k-1)/k}p_{k+1},$$

which implies

$$p_k^{(k+1)/k} \ge p_{k+1}.$$

Taking the $(k+1)$-st root, we obtain $p_k^{1/k} \ge p_{k+1}^{1/(k+1)}$. If we have $p_1 = p_2^{1/2} =
\dots = p_n^{1/n} = m$, then from the previous exercise, $f(x)$ equals

$$x^n + \binom{n}{1}mx^{n-1} + \dots + \binom{n}{n-1}m^{n-1}x + m^n = (x + m)^n$$

by the binomial theorem. Hence all the $a_j$'s equal $m$.
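
The two chains of inequalities in 3.3.27 and 3.3.28 are easy to spot-check numerically. The sketch below assumes, as in the solutions, that $p_k$ is defined by $f(x) = \prod_j (x + a_j) = \sum_k \binom{n}{k} p_k x^{n-k}$, so $p_k$ is the $k$th elementary symmetric polynomial of the $a_j$ divided by $\binom{n}{k}$. The function name and the sample values are illustrative.

```python
from itertools import combinations
from math import comb, prod

def normalized_symmetric_means(a):
    """Return [p_0, ..., p_n] with p_k = e_k(a) / C(n, k)."""
    n = len(a)
    return [sum(prod(c) for c in combinations(a, k)) / comb(n, k)
            for k in range(n + 1)]

a = [0.5, 1.0, 2.0, 3.5, 7.0]  # arbitrary positive sample values
p = normalized_symmetric_means(a)

# p_k^2 >= p_{k-1} p_{k+1} for 1 <= k <= n - 1
newton_ok = all(p[k] ** 2 >= p[k - 1] * p[k + 1] for k in range(1, len(a)))

# p_1 >= p_2^{1/2} >= ... >= p_n^{1/n}
roots = [p[k] ** (1.0 / k) for k in range(1, len(a) + 1)]
maclaurin_ok = all(roots[i] >= roots[i + 1] for i in range(len(roots) - 1))

print(newton_ok, maclaurin_ok)
```

The first entry of `roots` is the arithmetic mean of the $a_j$ and the last is their geometric mean.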


3.3.29 Since $p$, $p'$, $p''$, ... are polynomials in $x$ with integer coefficients and
$f'/f$, $f''/f$, ... are sums of products of $p$, $p'$, $p''$, ..., plugging $x = a$ yields
$f(t,a) = 1$; hence the result follows.


3.3.30 Referring to the previous exercise, with $p(x) = x(a - x)$ and $f(t,x) =
\exp(tp(x))$, we know $f^{(k)}(t,0)$ and $f^{(k)}(t,a)$ are polynomials in $t$ with integer
coefficients for $k \ge 0$. Setting $g(t,x) = f(t,bx)$, it follows that $g^{(k)}(t,0)$ and
$g^{(k)}(t,p)$ are polynomials in $t$ with integer coefficients for $k \ge 0$. But these


coefficients are g® (0) and gs" (p). 


Solutions to Exercises 3.4 


3.4.1 Since $n! \ge 2^{n-1}$ (Exercise 1.3.20),

$$\sum_{k \ge n+1} \frac{1}{k!} \le \sum_{k \ge n+1} 2^{1-k} = 2^{1-n}.$$

Now choose $n = 15$. Then $2^{n-1} > 10^4$, so adding the terms of the series
up to $n = 15$ yields accuracy to four decimals. Adding these terms yields
$e \sim 2.718281829$ where $\sim$ means that the error is $< 10^{-4}$.
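
The estimate in 3.4.1 can be reproduced with exact rational arithmetic. This is only an illustrative check; the variable names are ours.

```python
from fractions import Fraction
from math import e, factorial

n = 15
# Partial sum 1/0! + 1/1! + ... + 1/n! of the exponential series at 1
partial = sum(Fraction(1, factorial(k)) for k in range(n + 1))
# Tail bound 2^(1-n) obtained from n! >= 2^(n-1)
tail_bound = Fraction(1, 2 ** (n - 1))

print(float(partial), float(tail_bound))
```

The bound $2^{1-n}$ is far from sharp: the true tail is of the order of $1/16!$.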


3.4.2 If $n \ge 100$, then

$$n! \ge 101\cdot102\cdots n \ge 100\cdot100\cdots100 = 100^{n-100}.$$

Hence $(n!)^{1/n} \ge 100^{(n-100)/n}$, which clearly approaches 100. Thus the lower
limit of $((n!)^{1/n})$ is $\ge 100$. Since 100 may be replaced by any $N$, the result
follows.

3.4.3 $a_n = f^{(n)}(0)/n! = b_n$.


3.4.4 Since $\sinh(\pm\infty) = \pm\infty$, $\sinh(\mathbf{R}) = \mathbf{R}$. Since $\sinh' = \cosh > 0$, sinh is
bijective, hence invertible. Note that $\cosh^2 - \sinh^2 = 1$, so

$$\cosh^2(\operatorname{arcsinh} x) = 1 + x^2.$$

The derivative of $\operatorname{arcsinh} : \mathbf{R} \to \mathbf{R}$ is (by the IFT)

$$\operatorname{arcsinh}'(x) = \frac{1}{\sinh'(\operatorname{arcsinh} x)} = \frac{1}{\cosh(\operatorname{arcsinh} x)} = \frac{1}{\sqrt{1+x^2}}.$$

Since $1/\sqrt{1+x^2}$ is smooth, so is arcsinh. Now, cosh is superlinear since
$\cosh x \ge e^{|x|}/2$ and strictly convex since $\cosh'' = \cosh > 0$. Hence the max in

$$g(y) = \max_{-\infty<x<\infty} (xy - \cosh(x))$$

is attained at $x = \operatorname{arcsinh} y$. We obtain $g(y) = y\operatorname{arcsinh} y - \sqrt{1+y^2}$.

3.4.5 With $a_n = (-1)^n/4^n(n!)^2$, use the ratio test,

$$\frac{|a_n|}{|a_{n+1}|} = 4(n+1)^2 \to \infty.$$

Hence the radius $\rho$ equals $\infty$.


3.4.6 Here neither the ratio test nor the root test works. If |z| > 1, by the 
nth term test, the series diverges, whereas if |z| < 1, the series converges 
absolutely by comparison with the geometric series. Hence R = 1. 


3.4.7 Inserting $-x$ for $x$ in the series yields $f(-x) = a_0 - a_1x + a_2x^2 - \dots$.
Hence

$$f^e(x) = \frac{f(x) + f(-x)}{2} = a_0 + a_2x^2 + \dots.$$

But $f$ is even iff $f = f^e$, so the result follows. The odd case is similar.


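3.4.8 Establish the first identity by induction. If $k = 1$, we have

$$\left(x\frac{d}{dx}\right)\left(\frac{1}{1-x}\right) = \frac{x}{(1-x)^2} = -\frac{1}{1-x} + \frac{1}{(1-x)^2}.$$

Now assume the identity is true for $k$; differentiate it to get

$$\left(x\frac{d}{dx}\right)^{k+1}\left(\frac{1}{1-x}\right) = x\frac{d}{dx}\sum_{j=0}^{k}\frac{a_j}{(1-x)^{j+1}} = \sum_{j=0}^{k}\frac{(j+1)a_jx}{(1-x)^{j+2}} = \sum_{j=0}^{k}\frac{-(j+1)a_j}{(1-x)^{j+1}} + \sum_{j=0}^{k}\frac{(j+1)a_j}{(1-x)^{j+2}}.$$

This establishes the inductive step. For the second assertion, note

$$\sum_{n=1}^{\infty}\frac{n^k}{2^n} = \left.\left(x\frac{d}{dx}\right)^k\left(\frac{1}{1-x}\right)\right|_{x=1/2}$$

by differentiating the geometric series under the summation sign. The result
follows by plugging $x = 1/2$ into the first assertion.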


Solutions to Exercises 3.5 


3.5.1 By Taylor's Theorem, $f(c+t) = f(c) + f'(c)t + f''(\eta)t^2/2$ with $\eta$
between $c$ and $c+t$. Since $f(c+t) \ge 0$ and $f''(\eta) \le 1/2$, we obtain

$$0 \le f(c) + f'(c)t + t^2/4, \qquad -\infty < t < \infty.$$

Hence the quadratic $Q(t) = f(c) + f'(c)t + t^2/4$ has at most one root. But
this implies (Exercise 1.4.5) $f'(c)^2 - 4(1/4)f(c) \le 0$, which gives the result.


3.5.2 For $n = 1$, A is true since we can choose $R_1(x) = 1$. Also B is true for
$n = 1$ since $h(0) = 0$. Now, assume that A and B are true for $n$. Then

$$\lim_{x\to0+}\frac{h^{(n-1)}(x) - h^{(n-1)}(0)}{x} = \lim_{x\to0+}\frac{R_n(x)e^{-1/x}}{x} = \lim_{t\to\infty} tR_n(1/t)e^{-t} = 0$$

since $t^de^{-t} \to 0$, as $t \to \infty$, and $R_n$ is rational. Since
$\lim_{x\to0-}[h^{(n-1)}(x) - h^{(n-1)}(0)]/x = 0$, this establishes B for $n + 1$. Now, establish A for $n + 1$ using
the product rule and the fact that the derivative of a rational function is
rational. Thus A and B hold by induction for all $n \ge 1$.


3.5.3 Apply the binomial theorem with $v = -1/2$ to obtain

$$\frac{1}{\sqrt{1+x}} = \sum_{n=0}^{\infty}\binom{-1/2}{n}x^n.$$

Now, replace $x$ by $-x^2$.


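3.5.4 If $f(x) = \log(1+x)$, then $f(0) = 0$, $f'(x) = 1/(1+x)$, $f''(x) =
-1/(1+x)^2$, and $f^{(n)}(x) = (-1)^{n-1}(n-1)!/(1+x)^n$ for $n \ge 1$. Hence
$f^{(n)}(0)/n! = (-1)^{n-1}/n$ or

$$\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots.$$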


3.5.5 If $1/(1+x) = \sum_{n=0}^\infty a_nx^n$ and $\log(1+x) = \sum_{n=0}^\infty b_nx^n$, then
$\log(1+x)/(1+x) = \sum_{n=0}^\infty c_nx^n$, where

$$c_n = a_0b_n + a_1b_{n-1} + \dots + a_{n-1}b_1 + a_nb_0.$$

Since $a_n = (-1)^n$, $b_0 = 0$, and $b_n = (-1)^{n-1}/n$ for $n \ge 1$, we obtain

$$c_n = \sum_{i=0}^{n-1} a_ib_{n-i} = \sum_{i=0}^{n-1}\frac{(-1)^i(-1)^{n-i-1}}{n-i} = -(-1)^n\left(1 + \frac12 + \frac13 + \dots + \frac1n\right).$$
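
The Cauchy product in 3.5.5 can be checked with exact rational arithmetic. The function names below are illustrative; the coefficient sequences are the ones stated in the solution.

```python
from fractions import Fraction

def a(i):
    """Coefficient of x^i in 1/(1+x)."""
    return Fraction((-1) ** i)

def b(j):
    """Coefficient of x^j in log(1+x); the constant term is zero."""
    return Fraction(0) if j == 0 else Fraction((-1) ** (j - 1), j)

def cauchy_coefficient(n):
    """c_n = a_0 b_n + a_1 b_{n-1} + ... + a_n b_0."""
    return sum(a(i) * b(n - i) for i in range(n + 1))

def closed_form(n):
    """-(-1)^n (1 + 1/2 + ... + 1/n)."""
    return -((-1) ** n) * sum(Fraction(1, k) for k in range(1, n + 1))

print([str(cauchy_coefficient(n)) for n in range(5)])
```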


3.5.6 First, $f(x) = f(0) + f'(0)x + h(x)x^2/2$ with $h$ continuous and $h(0) =
f''(0) = q$. So

$$f\left(\frac{x}{\sqrt n}\right)^n = \left(1 + \frac{h(x/\sqrt n)\,x^2}{2n}\right)^n.$$

Now, apply Exercise 3.2.3 with $a_n = h(x/\sqrt n)x^2/2 \to qx^2/2$ to obtain the
result.

3.5.7 Define $h(t)$ by $e^t = 1 + t + t^2h(t)/2$, $t \ne 0$. Then $e^t - 1 = t(1 + th(t)/2)$,
and, by the exponential series, $\lim_{t\to0} h(t) = 1$. Now,

$$\frac{1}{e^t - 1} - \frac1t = \frac{1}{t[1 + th(t)/2]} - \frac1t = \frac1t\left(\frac{1}{1 + th(t)/2} - 1\right) = \frac1t\cdot\frac{-th(t)/2}{1 + th(t)/2} = \frac{-h(t)/2}{1 + th(t)/2}.$$

This shows that the limit is $-1/2$.


3.5.8 This follows from the Lagrange form of the remainder in Taylor's theorem:
for $|x - c| < d$,


for) n h 
, (n+ i je— el? <C. 


3.5.9 With a; = f(c)/j!, by (3.5.7), we have 


S~ |ajld? = g(le| +4). 


jai 


Thus the (mn + 1)-st remainder is no larger than the tail }> 
which is no larger in absolute value than 


ann a; (x - c)4 


CoO Co 


ig Ste SE ;- le-e" 
ea b2 lag|a?-"—* < Gat be laj|d? < pag g(le| + d). 
j=nt1 g=ntl 
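
3.5.10 For $n = 0$, this is immediate. If $n \ge 1$,

$$\binom{-1/2}{n} = \frac{1}{n!}\left(-\frac12\right)\left(-\frac12 - 1\right)\cdots\left(-\frac12 - n + 1\right) = \frac{(-1)^n}{2^nn!}\cdot1\cdot3\cdot5\cdots(2n-1)$$
$$= \frac{(-1)^n}{2^nn!}\cdot\frac{1\cdot2\cdot3\cdots(2n-1)\cdot(2n)}{2\cdot4\cdot6\cdots(2n)} = \frac{(-1)^n(2n)!}{4^n(n!)^2}.$$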

Solutions to Exercises 3.6 


3.6.1 Since $\sin^2 + \cos^2 = 1$, $\tan^2 + 1 = 1/\cos^2$, or $\cos^2 = 1/(1 + \tan^2)$. Hence
$\cos^2(\theta/2) = 1/(1 + t^2)$, which gives

$$\cos\theta = 2\cos^2(\theta/2) - 1 = \frac{2}{1+t^2} - 1 = \frac{1-t^2}{1+t^2}.$$

Also

$$\sin\theta = 2\sin(\theta/2)\cos(\theta/2) = 2t\cos^2(\theta/2) = \frac{2t}{1+t^2}.$$

Also

$$\tan\theta = \sin\theta/\cos\theta = \frac{2t}{1-t^2}.$$


3.6.2 A straightforward calculation using $\cos^2\theta + \sin^2\theta = 1$.


3.6.3 Use the addition formulae

$$R(\cos\phi, \sin\phi) = (\cos\theta\cos\phi - \sin\theta\sin\phi,\ \sin\theta\cos\phi + \cos\theta\sin\phi) = (\cos(\phi + \theta), \sin(\phi + \theta)).$$


3.6.4 By the previous exercises, we may translate and rotate the right triangle
without affecting the validity of the claim. Thus we may assume the
right angle vertex is the origin. Since every point in the plane is of the form
$(r\cos\theta, r\sin\theta)$, we may rotate the triangle so that its sides are on the axes.
Then the result is immediate.

3.6.5 $f$ is differentiable at all nonzero reals, hence continuous there. Since
$|f(x)| \le |x|$, $f$ is also continuous at $x = 0$. Compute the variation of $f$
corresponding to the partition $x_k = 2/(k\pi)$, $k = 1,\dots,n$. Since $f(x_k) =
0$ for $k$ even and $f(x_k) = \pm2/(k\pi)$ for $k$ odd, the variation is larger than
$(2/\pi)(1/2 + 1/3 + \dots + 1/n)$. Hence $f$ is not of bounded variation near 0.


3.6.6 If $x \ne 0$, then $f'(x) = 2x\sin(1/x) - \cos(1/x)$. If $x = 0$, then $f'(0) =
\lim_{x\to0} f(x)/x = \lim_{x\to0} x\sin(1/x) = 0$. Hence $|f'(x)| \le 1 + 2|x|$ for all $x$. By
Exercise 3.1.7, $f$ is of bounded variation on any bounded interval.


3.6.7 If $f$ were injective on $(0,\epsilon)$, then by the IFT for continuous functions
applied on $[0,\epsilon]$, $f$ would be monotone. By the IFT for differentiable
functions, this implies $f' \ge 0$ on $(0,\epsilon)$ or $f' \le 0$ on $(0,\epsilon)$. But
$f'(x) = 1 + 4x\sin(1/x) - 2\cos(1/x)$ for $x \ne 0$. Since $\cos(1/x)$ oscillates
between $\pm1$ arbitrarily close to 0, $f'$ takes on positive and negative values
arbitrarily close to 0.


3.6.8 It is enough to show that $x^n/n! \ge x^{n+1}/(n+1)!$ for $0 \le x \le 3$ and
$n \ge 3$. But, simplifying, we see that the inequality holds iff $x \le n + 1$, which
is true, since $x \le 3 \le n + 1$.


3.6.9 $\sin(\pi/3) = \sin(\pi - \pi/3)$. Hence

$$\sin(\pi/3) = \sin(2\pi/3) = 2\sin(\pi/3)\cos(\pi/3),$$

or $\cos(\pi/3) = 1/2$, which implies $\sin(\pi/3) = \sqrt3/2$ and $\tan(\pi/3) = \sqrt3$. We
obtain $\sin(\pi/2 - x) = \cos x$ and $\cos(\pi/2 - x) = \sin x$. Hence $\sin(\pi/6) = 1/2$,
$\cos(\pi/6) = \sqrt3/2$, and $\tan(\pi/6) = 1/\sqrt3$. Also $0 = \cos(\pi/2) = 2\cos(\pi/4)^2 - 1$,
so $\cos(\pi/4) = 1/\sqrt2$, $\sin(\pi/4) = 1/\sqrt2$, and $\tan(\pi/4) = 1$.


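3.6.10 Let $\theta = \pi/9$. Then $\sin(8\theta) = \sin(\theta)$ since $8\theta + \theta = 9\theta = \pi$. Hence

$$\sin\theta = \sin(8\theta) = 2\sin(4\theta)\cos(4\theta) = 4\sin(2\theta)\cos(2\theta)\cos(4\theta) = 8\sin(\theta)\cos(\theta)\cos(2\theta)\cos(4\theta).$$

Now, divide both sides by $8\sin\theta$.
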
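3.6.11 Let $\theta = 2\pi/5$ and $x = \cos(2\theta) = \cos(4\pi/5) < 0$. Then

$$0 = \sin(\theta + 4\theta) = \sin\theta\cos(4\theta) + \cos\theta\sin(4\theta) = \sin\theta\,(2x^2 - 1) + 2\cos\theta\sin(2\theta)\,x$$
$$= \sin\theta\,(2x^2 - 1) + 4x\cos^2\theta\sin\theta = \sin\theta\,(2x^2 - 1 + 2x(1+x)) = \sin\theta\,(4x^2 + 2x - 1).$$

Solving the quadratic yields $x = (-1 - \sqrt5)/4$; hence $\cos(\pi/5) = (1 + \sqrt5)/4$.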
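
The closed forms obtained in 3.6.10 and 3.6.11 can be spot-checked in floating point. This is an illustrative check only, with tolerances chosen by us.

```python
from math import cos, isclose, pi, sqrt

# 3.6.10: dividing by 8 sin(theta) gives cos(pi/9) cos(2pi/9) cos(4pi/9) = 1/8
product = cos(pi / 9) * cos(2 * pi / 9) * cos(4 * pi / 9)

# 3.6.11: x = cos(4pi/5) should be a root of 4x^2 + 2x - 1
x = cos(4 * pi / 5)
residual = 4 * x * x + 2 * x - 1

print(product, residual, cos(pi / 5), (1 + sqrt(5)) / 4)
```
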

3.6.12 Let $s_n$ denote the sum on the left. Since $2\cos a\sin b = \sin(a + b) -
\sin(a - b)$ from (3.6.3), with $a = x$ and $b = x/2$,

$$\sin(x/2)s_1 = \sin(x/2) + 2\cos x\sin(x/2) = \sin(x/2) + \sin(3x/2) - \sin(x/2) = \sin(3x/2).$$

Thus the result is true when $n = 1$. Assuming the result is true for $n$ and
repeating the same reasoning with $a = (n+1)x$ and $b = x/2$,

$$\sin(x/2)s_{n+1} = \sin(x/2)\,(s_n + 2\cos((n+1)x))$$
$$= \sin((n + 1/2)x) + 2\sin(x/2)\cos((n+1)x)$$
$$= \sin((n + 1/2)x) + \sin((n + 3/2)x) - \sin((n + 1/2)x)$$
$$= \sin((n + 3/2)x).$$

This derives the result for $n + 1$. Hence the result is true for all $n \ge 1$.
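
The identity proved by induction in 3.6.12 can be tested numerically. The sketch assumes, as the base case of the solution indicates, that $s_n = 1 + 2\cos x + 2\cos 2x + \dots + 2\cos nx$; the function names and sample points are ours.

```python
from math import cos, sin

def s(n, x):
    """s_n(x) = 1 + 2 cos x + 2 cos 2x + ... + 2 cos nx."""
    return 1 + 2 * sum(cos(k * x) for k in range(1, n + 1))

def identity_gap(n, x):
    """|sin(x/2) s_n(x) - sin((n + 1/2) x)|, which should vanish up to rounding."""
    return abs(sin(x / 2) * s(n, x) - sin((n + 0.5) * x))

worst = max(identity_gap(n, x)
            for n in range(1, 30)
            for x in (0.1, 1.0, 2.5, 4.0, 6.0))
print(worst)
```
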
3.6.13 Divide $2\cos(2x) = 2\cos^2x - 2\sin^2x$ by $\sin(2x) = 2\sin x\cos x$.


3.6.14 The first identity is established using the double-angle formula. When
$n = 2$, the identity (3.6.5) says $(x^4 - 1) = (x^2 - 1)(x^2 + 1)$ and is true. Now,
assume the validity of the identity (3.6.5) for $n$. To obtain (3.6.5) with $2n$
replacing $n$, replace $x$ by $x^2$ in (3.6.5) and use the first identity. Then

$$\frac{x^{4n} - 1}{x^4 - 1} = \prod_{k=1}^{n-1}\left[x^4 - 2x^2\cos(k\pi/n) + 1\right]$$
$$= \prod_{k=1}^{n-1}\left[x^2 - 2x\cos(k\pi/2n) + 1\right]\cdot\prod_{k=1}^{n-1}\left[x^2 - 2x\cos(\pi - k\pi/2n) + 1\right]$$
$$= \prod_{k=1}^{n-1}\left[x^2 - 2x\cos(k\pi/2n) + 1\right]\cdot\prod_{k=1}^{n-1}\left[x^2 - 2x\cos((2n-k)\pi/2n) + 1\right]$$
$$= \prod_{k=1}^{n-1}\left[x^2 - 2x\cos(k\pi/2n) + 1\right]\cdot\prod_{k=n+1}^{2n-1}\left[x^2 - 2x\cos(k\pi/2n) + 1\right]$$
$$= \prod_{\substack{k \ne n \\ 1 \le k \le 2n-1}}\left[x^2 - 2x\cos(k\pi/2n) + 1\right]$$
$$= \frac{1}{x^2 + 1}\prod_{k=1}^{2n-1}\left[x^2 - 2x\cos(k\pi/2n) + 1\right].$$

Multiplying by $(x^4 - 1) = (x^2 - 1)(x^2 + 1)$, we obtain the result.


3.6.15 Let $a_n = 1/n$ and $c_n = \cos(nx)$, $n \ge 1$. Then for $x \notin 2\pi\mathbf{Z}$, by
Exercise 3.6.12, the sequence $b_n = c_1 + \dots + c_n$, $n \ge 1$, is bounded. Hence
by the Dirichlet test, $\sum \cos(nx)/n$ converges.

3.6.16 Fix $a \in \mathbf{R}$. We claim $f(a) = f(0)$. Since $f$ is continuous at $a$, given
$\epsilon > 0$, we can select $\delta > 0$ such that $|x - a| < \delta$ implies $|f(x) - f(a)| < \epsilon$. Since
$p^* = 0$, we can select a period $p$ less than $\delta$. Then by periodicity $f(0) = f(np)$
for all integers $n$. Select $n$ such that $np \le a < (n+1)p$. Then $|np - a| < \delta$, so
$|f(np) - f(a)| < \epsilon$. Thus $|f(0) - f(a)| < \epsilon$. Since $\epsilon$ is arbitrary, we conclude
$f(0) = f(a)$. Thus $f$ is constant.


Solutions to Exercises 3.7 


3.7.1 With $dv = e^xdx$ and $u = \cos x$, $v = e^x$ and $du = -\sin x\,dx$. So

$$I = \int e^x\cos x\,dx = e^x\cos x + \int e^x\sin x\,dx.$$

Repeat with $dv = e^xdx$ and $u = \sin x$. We get $v = e^x$ and $du = \cos x\,dx$.
Hence

$$\int e^x\sin x\,dx = e^x\sin x - \int e^x\cos x\,dx.$$

Now, insert the second equation into the first to yield

$$I = e^x\cos x + [e^x\sin x - I].$$

Solving for $I$ yields

$$\int e^x\cos x\,dx = \frac12\left(e^x\cos x + e^x\sin x\right).$$


3.7.2 Let $u = \arcsin x$. Then $x = \sin u$, so $dx = \cos u\,du$. So

$$\int e^{\arcsin x}dx = \int e^u\cos u\,du = \frac12e^u(\sin u + \cos u)$$

by Exercise 3.7.1. Since $\cos u = \sqrt{1 - x^2}$, we obtain

$$\int e^{\arcsin x}dx = \frac12e^{\arcsin x}\left(x + \sqrt{1 - x^2}\right).$$


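3.7.3

$$\int\frac{x+1}{\sqrt{1-x^2}}\,dx = \int\frac{x\,dx}{\sqrt{1-x^2}} + \int\frac{dx}{\sqrt{1-x^2}} = -\frac12\int\frac{d(1-x^2)}{\sqrt{1-x^2}} + \arcsin x$$
$$= -(1-x^2)^{1/2} + \arcsin x = \arcsin x - \sqrt{1-x^2}.$$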

3.7.4 If $u = \arctan x$, then


t ee 
ie ee 


3.7.5 If $u = (\log x)^2$ and $dv = x^2dx$, then $v = x^3/3$ and $du = 2(\log x)dx/x$.
Hence

$$\int x^2(\log x)^2dx = \frac13x^3(\log x)^2 - \frac23\int x^2\log x\,dx.$$

If $u = \log x$ and $dv = x^2dx$, then $v = x^3/3$ and $du = dx/x$. Hence

$$\int x^2\log x\,dx = \frac13x^3\log x - \frac13\int x^2dx = \frac13x^3\log x - \frac19x^3.$$

Now, insert the second integral into the first equation, and rearrange to obtain

$$\int x^2(\log x)^2dx = \frac{x^3}{27}\left(9\log^2x - 6\log x + 2\right).$$


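3.7.6 Take $u = \sqrt{1 - e^{-2x}}$. Then $u^2 = 1 - e^{-2x}$, so $2u\,du = 2e^{-2x}dx =
2(1 - u^2)dx$. Hence

$$\int\sqrt{1 - e^{-2x}}\,dx = \int\frac{u^2\,du}{1 - u^2} = \int\frac{du}{1 - u^2} - \int du = \frac12\log\left(\frac{1+u}{1-u}\right) - u,$$

which simplifies to

$$\int\sqrt{1 - e^{-2x}}\,dx = \log\left(1 + \sqrt{1 - e^{-2x}}\right) + x - \sqrt{1 - e^{-2x}}.$$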


3.7.7 Since $|\sin| \le 1$, $F'(0) = \lim_{x\to0} F(x)/x = \lim_{x\to0} x\sin(1/x) = 0$.
Moreover,

$$F'(x) = 2x\sin(1/x) - \cos(1/x), \qquad x \ne 0.$$

So $F'$ is not continuous at zero.


3.7.8 If f is of bounded variation, then its discontinuities are, at worst, jumps 
(Exercise 2.3.18). But if f = F’, then f is a derivative and Exercise 3.1.6 
says f cannot have any jumps. Thus f must be continuous. 


3.7.9 Let $\theta = \arcsin\sqrt x$. Then $x = \sin^2\theta$ and the left side equals $2\theta$. Now,
$2x - 1 = 2\sin^2\theta - 1 = -\cos(2\theta)$, so the right side equals

$$\pi/2 - \arcsin(\cos(2\theta)) = \arccos[\cos(2\theta)] = 2\theta.$$

3.7.10 Since $(\arcsin x)' = 1/\sqrt{1 - x^2}$ and the derivative of the exhibited
series is the Taylor series (Exercise 3.5.3) of $1/\sqrt{1 - x^2}$, the result follows
from the theorem in this section.


3.7.11 Integrate by parts, with $u = f(x)$ and $dv = \sin x\,dx$, $v = -\cos x$,
and $du = f'(x)dx$. So

$$\int f(x)\sin x\,dx = -f(x)\cos x + \int f'(x)\cos x\,dx.$$

Repeating with $u = f'(x)$ and $dv = \cos x\,dx$,

$$\int f(x)\sin x\,dx = -f(x)\cos x + f'(x)\sin x - \int f''(x)\sin x\,dx.$$

But this is a recursive formula. Repeating this procedure with derivatives of
$f$ replacing $f$, we obtain the result.


3.7.12 Divide the first equation in (3.6.2) by the second equation in (3.6.2).
You obtain the tangent formula. Set $a = \arctan(1/5)$ and $b = \arctan(1/239)$.
Then $\tan a = 1/5$, so $\tan(2a) = 5/12$, so $\tan(4a) = 120/119$. Also $\tan b =
1/239$, so

$$\tan(4a - b) = \frac{\tan(4a) - \tan b}{1 + \tan(4a)\tan b} = \frac{(120/119) - (1/239)}{1 + (120/119)(1/239)} = \frac{120\cdot239 - 119}{119\cdot239 + 120} = 1.$$

Hence $4a - b = \pi/4$.
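
The arithmetic in 3.7.12 can be replayed exactly with rational numbers. The helper `tan_add` below simply encodes the tangent addition formula; its name is ours.

```python
from fractions import Fraction

def tan_add(s, t):
    """tan(A + B) in terms of s = tan A and t = tan B."""
    return (s + t) / (1 - s * t)

tan_a = Fraction(1, 5)
tan_2a = tan_add(tan_a, tan_a)      # tan(2a)
tan_4a = tan_add(tan_2a, tan_2a)    # tan(4a)
tan_b = Fraction(1, 239)
tan_diff = tan_add(tan_4a, -tan_b)  # tan(4a - b), using tan(-b) = -tan b

print(tan_2a, tan_4a, tan_diff)
```
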

3.7.13 Since

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots$$

is alternating with decreasing terms (as long as $0 < x < 1$), plugging in
$x = 1/5$ and adding the first two terms yield $\arctan(1/5) = .1973$ with an
error less than the third term which is less than $1 \times 10^{-4}$. Now, plugging
$x = 1/239$ into the first term yields $.00418$ with an error less than the second
term which is less than $10^{-6}$. Since 16 times the first error plus 4 times the
second error is less than $10^{-2}$, $\pi = 16\arctan(1/5) - 4\arctan(1/239) = 3.14$
with an error less than $10^{-2}$.
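
The estimate in 3.7.13 uses only two terms of the series at $1/5$ and one term at $1/239$. The following sketch reproduces that computation in floating point and compares it with the library value of $\pi$; the variable names are ours.

```python
from math import pi

# Two terms of the arctan series at x = 1/5, one term at x = 1/239
arctan_fifth = 1 / 5 - (1 / 5) ** 3 / 3
arctan_239th = 1 / 239

# First omitted term in each series bounds the corresponding error
error_fifth = (1 / 5) ** 5 / 5
error_239th = (1 / 239) ** 3 / 3

pi_estimate = 16 * arctan_fifth - 4 * arctan_239th
print(pi_estimate, abs(pi - pi_estimate), 16 * error_fifth + 4 * error_239th)
```
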


3.7.14 If $\theta = \arcsin(\sin 100)$, then $\sin(\theta) = \sin 100$ and $|\theta| \le \pi/2$. But $32\pi =
32 \times 3.14 = 100.48$, and $31.5\pi = 98.9$, with an error less than $32 \times 10^{-2} = .32$.
Hence we are sure that $31.5\pi < 100 < 32\pi$ or $-\pi/2 < 100 - 32\pi < 0$, i.e.,
$\theta = 100 - 32\pi$.

3.7.15 Let $u = 1 - x^2$. Then $du = -2x\,dx$. Hence

$$\int\frac{-4x}{1 - x^2}\,dx = \int\frac{2\,du}{u} = 2\log u = 2\log(1 - x^2).$$
1-2 u 


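3.7.16 Completing the square, $x^2 - \sqrt2x + 1 = (x - 1/\sqrt2)^2 + 1/2$. So with
$u = \sqrt2x - 1$ and $v = (\sqrt2x - 1)^2 + 1 = 2x^2 - 2\sqrt2x + 2 = 2(x^2 - \sqrt2x + 1)$,

$$\int\frac{4\sqrt2 - 4x}{x^2 - \sqrt2x + 1}\,dx = \int\frac{8\sqrt2 - 8x}{(\sqrt2x - 1)^2 + 1}\,dx$$
$$= \int\frac{4\sqrt2}{(\sqrt2x - 1)^2 + 1}\,dx - 2\int\frac{2\sqrt2(\sqrt2x - 1)}{(\sqrt2x - 1)^2 + 1}\,dx$$
$$= \int\frac{4\,du}{u^2 + 1} - 2\int\frac{dv}{v}$$
$$= 4\arctan u - 2\log v + 2\log2$$
$$= 4\arctan(\sqrt2x - 1) - 2\log(x^2 - \sqrt2x + 1).$$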


3.7.17 Let $s_n(x)$ denote the $n$th partial sum in (3.7.3). Then if $0 < x < 1$,
by the Leibnitz test,

$$s_{2n}(x) \le \log(1 + x) \le s_{2n-1}(x), \qquad n \ge 1.$$

In this last inequality, the number of terms in the partial sums is finite.
Letting $x \nearrow 1$, we obtain

$$s_{2n}(1) \le \log 2 \le s_{2n-1}(1), \qquad n \ge 1.$$

Now, let $n \nearrow \infty$.
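
The bracketing of $\log 2$ by consecutive partial sums can be observed numerically. The sketch assumes that (3.7.3) is the series $\log(1+x) = x - x^2/2 + x^3/3 - \dots$, which is what the solution's use of the Leibnitz test suggests; the function name `s` mirrors the notation $s_n(x)$.

```python
from math import log

def s(n, x):
    """n-th partial sum of x - x^2/2 + x^3/3 - ..."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n + 1))

# Even partial sums at x = 1 lie below log 2, odd ones lie above
brackets_ok = all(s(2 * n, 1.0) <= log(2) <= s(2 * n - 1, 1.0)
                  for n in range(1, 51))
gap = s(99, 1.0) - s(100, 1.0)  # width of the bracket for n = 50
print(brackets_ok, gap)
```
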


Solutions to Exercises 4.1


4.1.1 The first subrectangle thrown out has area $(1/3) \times 1 = 1/3$, the next
two each has area 1/9, the next four each has area 1/27, and so on. So the
areas of the removed rectangles sum to $(1/3)(1 + (2/3) + (2/3)^2 + \dots) = 1$.
Hence the area of what is left, $C'$, is zero. At the $n$th stage, the width of each
of the remaining rectangles in $C'_n$ is $3^{-n}$. Since $C' \subset C'_n$, no rectangle in $C'$
can have width greater than $3^{-n}$. Since $n \ge 1$ is arbitrary, no open rectangle
can lie in $C'$.


4.1.2 Here the widths of the removed rectangles sum to $\alpha/3 + 2\alpha/3^2 + 4\alpha/3^3 +
\dots = \alpha$, so the area of what is left, $C^\alpha$, is $1 - \alpha > 0$. At the $n$th stage, the
width of each of the remaining rectangles in $C^\alpha_n$ is $3^{-n}\alpha$. Since $C^\alpha \subset C^\alpha_n$, no
rectangle in $C^\alpha$ can have width greater than $3^{-n}\alpha$. Since $n \ge 1$ is arbitrary,
no open rectangle can lie in $C^\alpha$.


4.1.3 Here all expansions are ternary. Since $[0,2] \times [0,2] = 2C_0$, given $(x,y) \in
C_0$, we have to find $(x',y') \in C$ and $(x'',y'') \in C$ satisfying $x' + x'' = 2x$
and $y' + y'' = 2y$. Let $x = .d_1d_2d_3\dots$ and $y = .e_1e_2e_3\dots$. Then for all
$n \ge 1$, $2d_n$ and $2e_n$ are 0, 2, or 4. Thus there are digits $d'_n$, $d''_n$, $e'_n$, $e''_n$, $n \ge 1$,
equalling 0 or 2 and satisfying $d'_n + d''_n = 2d_n$ and $e'_n + e''_n = 2e_n$. Now, set
$x' = .d'_1d'_2d'_3\dots$, $y' = .e'_1e'_2e'_3\dots$, $x'' = .d''_1d''_2d''_3\dots$, $y'' = .e''_1e''_2e''_3\dots$.
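
The digit-splitting argument in 4.1.3 is constructive, and a finite-precision version is easy to run. The sketch below works with finitely many ternary digits of one coordinate; the particular splitting rule (1 becomes 2 + 0) is one admissible choice among several, and the function names are ours.

```python
from fractions import Fraction

def split_digits(digits):
    """Split ternary digits d_n in {0, 1, 2} into two digit lists with entries
    in {0, 2} whose entrywise sum is 2 * d_n."""
    table = {0: (0, 0), 1: (2, 0), 2: (2, 2)}
    first = [table[d][0] for d in digits]
    second = [table[d][1] for d in digits]
    return first, second

def ternary_value(digits):
    """Exact value of the finite ternary expansion .d_1 d_2 d_3 ..."""
    return sum(Fraction(d, 3 ** n) for n, d in enumerate(digits, start=1))

x_digits = [1, 0, 2, 1, 1, 2, 0, 1]
x1, x2 = split_digits(x_digits)
print(x1, x2)
print(ternary_value(x1) + ternary_value(x2) == 2 * ternary_value(x_digits))
```

Applying the same splitting to the digits of $y$ gives the second coordinates.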


Solutions to Exercises 4.2 


4.2.1 If $Q$ is a rectangle, then so is $-Q$ and $\|Q\| = \|-Q\|$. If $(Q_n)$ is a paving
of $A$, then $(-Q_n)$ is a paving of $-A$, so

$$\operatorname{area}(-A) \le \sum_{n=1}^\infty \|-Q_n\| = \sum_{n=1}^\infty \|Q_n\|.$$

Since $\operatorname{area}(A)$ is the inf of the sums on the right, we obtain $\operatorname{area}(-A) \le
\operatorname{area}(A)$. Applying this to $-A$, instead of $A$, yields $\operatorname{area}(A) \le \operatorname{area}(-A)$
which yields reflection invariance, when combined with the previous inequality.
For monotonicity, if $A \subset B$ and $(Q_n)$ is a paving of $B$, then $(Q_n)$ is a
paving of $A$. So $\operatorname{area}(A) \le \sum_{n=1}^\infty \|Q_n\|$. Since the inf of the sums on the right
over all pavings of $B$ equals $\operatorname{area}(B)$, $\operatorname{area}(A) \le \operatorname{area}(B)$.


4.2.2 Let $L$ be a line segment. If $L$ is vertical, we already know that $\operatorname{area}(L) =
0$. Otherwise, by translation and dilation invariance, we may assume that
$L = \{(x,y) : 0 \le x \le 1,\ y = mx\}$. If $Q_i = [(i-1)/n, i/n] \times [m(i-1)/n, mi/n]$,
$i = 1,\dots,n$, then $(Q_1,\dots,Q_n)$ is a paving of $L$ and $\sum_{i=1}^n \|Q_i\| = m/n$. Since
$n \ge 1$ is arbitrary, we conclude that $\operatorname{area}(L) = 0$. Since any line $L$ is a
countable union of line segments, by subadditivity, $\operatorname{area}(L) = 0$. Or just use
rotation invariance to rotate $L$ into the $y$-axis.


4.2.3 Assume first the base is horizontal. Write $P = A \cup B \cup C$, where $A$ and
$B$ are triangles with horizontal bases and $C$ is a rectangle, all intersecting
only along their edges. Since the sum of the naive areas of $A$, $B$, and $C$ is
the naive area of $P$, subadditivity yields

$$\operatorname{area}(P) \le \operatorname{area}(A) + \operatorname{area}(B) + \operatorname{area}(C) = \|A\| + \|B\| + \|C\| = \|P\|.$$

To obtain the reverse inequality, draw two triangles $B$ and $C$ with horizontal
bases, such that $P \cup B \cup C$ is a rectangle and $P$, $B$, and $C$ intersect only
along their edges. Then the sum of the naive areas of $P$, $B$, and $C$ equals the
naive area of $P \cup B \cup C$, so by subadditivity of area,

$$\|P\| + \|B\| + \|C\| = \|P \cup B \cup C\| = \operatorname{area}(P \cup B \cup C)$$
$$\le \operatorname{area}(P) + \operatorname{area}(B) + \operatorname{area}(C) \le \operatorname{area}(P) + \|B\| + \|C\|.$$

Canceling $\|B\|$ and $\|C\|$, we obtain the reverse inequality $\|P\| \le \operatorname{area}(P)$.
The case with a vertical base is analogous.


4.2.4 If $T$ is the trapezoid, $T$ can be broken up into the union of a rectangle
and two triangles. As before, by subadditivity, this yields $\operatorname{area}(T) \le \|T\|$.
Also two triangles can be added to $T$ to obtain a rectangle. As before, this
yields $\|T\| \le \operatorname{area}(T)$.


4.2.5 By extending the sides of the rectangles, $A \cup B$ can be decomposed
into the union of finitely many rectangles intersecting only along their edges
(seven rectangles in the general case). Moreover, since $A \cap B$ is one of these
rectangles and is counted twice, using subadditivity, we obtain
$\operatorname{area}(A \cup B) \le \operatorname{area}(A) + \operatorname{area}(B) - \operatorname{area}(A \cap B)$.
To obtain the reverse inequality, add two rectangles $C$ and $D$, to fill in the corners,
obtaining a rectangle $A \cup B \cup C \cup D$, and proceed as before.


4.2.6 If $Q$ is a rectangle, then $H(Q)$ is a (possibly degenerate) rectangle and
$\|H(Q)\| = |k| \cdot \|Q\|$. If $(Q_n)$ is a paving of $A$, then $(H(Q_n))$ is a paving of
$H(A)$. So

\[ \operatorname{area}[H(A)] \le \sum_{n=1}^\infty \|H(Q_n)\| = \sum_{n=1}^\infty |k| \cdot \|Q_n\| = |k| \sum_{n=1}^\infty \|Q_n\|. \]

Taking the inf over all pavings of $A$ yields $\operatorname{area}[H(A)] \le |k| \cdot \operatorname{area}(A)$. This
establishes the result for $H$ when $k = 0$. When $k \ne 0$, $H$ is a bijection. In
this case, replace in the last inequality $A$ by $H^{-1}(A)$ and $k$ by $1/k$ to obtain
$|k| \cdot \operatorname{area}(A) \le \operatorname{area}(H(A))$. Thus $\operatorname{area}[H(A)] = |k| \cdot \operatorname{area}(A)$ for $k \ne 0$, hence
for all $k$ real. The case $V$ is similar.


4.2.7 Let $T(x,y) = (ax + by, cx + dy)$, $T'(x,y) = (a'x + b'y, c'x + d'y)$. Then

\[
\begin{aligned}
T \circ T'(x,y) &= (a(a'x + b'y) + b(c'x + d'y),\ c(a'x + b'y) + d(c'x + d'y)) \\
&= (a''x + b''y,\ c''x + d''y),
\end{aligned}
\]

where

\[ \begin{pmatrix} a'' & b'' \\ c'' & d'' \end{pmatrix} = \begin{pmatrix} aa' + bc' & ab' + bd' \\ ca' + dc' & cb' + dd' \end{pmatrix}. \]

This shows $T'' = T \circ T'$ is linear. Now

\[ \det(T'') = a''d'' - b''c'' = (aa' + bc')(cb' + dd') - (ab' + bd')(ca' + dc') \]

simplifies to $(ad - bc)(a'd' - b'c')$.
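The simplification in 4.2.7 can be spot-checked with exact integer arithmetic. The following Python sketch is only a sanity check, not part of the solution; the helper names `compose` and `det` are illustrative and the sample matrices are arbitrary.

```python
def compose(T, Tp):
    """Matrix of T o T', using the entries a'', b'', c'', d'' from 4.2.7."""
    (a, b), (c, d) = T
    (a1, b1), (c1, d1) = Tp
    return ((a * a1 + b * c1, a * b1 + b * d1),
            (c * a1 + d * c1, c * b1 + d * d1))


def det(T):
    """Determinant ad - bc of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = T
    return a * d - b * c


T = ((2, 3), (5, 7))
Tp = ((1, 4), (6, -2))
# det(T o T') should equal det(T) * det(T')
print(det(compose(T, Tp)), det(T) * det(Tp))
```

Both printed values agree for these sample matrices, as the identity $\det(T \circ T') = \det(T)\det(T')$ predicts.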


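4.2.8 If $\operatorname{area}(T(B)) = |\det(T)| \operatorname{area}(B)$ for all $B$ and
$\operatorname{area}(T'(A)) = |\det(T')| \operatorname{area}(A)$ for all $A$, replace $B$ by $T'(A)$ to obtain

\[
\begin{aligned}
\operatorname{area}(T \circ T'(A)) = \operatorname{area}(T(T'(A))) &= |\det(T)| \operatorname{area}(T'(A)) \\
&= |\det(T)|\,|\det(T')| \operatorname{area}(A) = |\det(T \circ T')| \operatorname{area}(A).
\end{aligned}
\]


4.2.9 Using matrix notation,

\[
\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & d \end{pmatrix} \circ \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \circ \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} a & 0 \\ c & d \end{pmatrix} = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix} \circ \begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix} \circ \begin{pmatrix} 1 & 0 \\ 0 & d \end{pmatrix}.
\]

Thus $U = V \circ S \circ H$ and $L = H \circ S \circ V$. Now

\[ \begin{pmatrix} a & 0 \\ c & f \end{pmatrix} \begin{pmatrix} 1 & e \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & ae \\ c & f + ec \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \]

if $e = b/a$ and $f = d - ec$. Hence every $T$ may be expressed as $L \circ U$ when
$a \ne 0$. When $a = 0$,

\[ \begin{pmatrix} 0 & b \\ c & d \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c & d \\ 0 & b \end{pmatrix}, \]

so in this case we have $T = F \circ U$.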


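4.2.10 If $(a,b)$ is in the unit disk, then $a^2 + b^2 < 1$. Hence $|a| < 1$. Hence

\[ \sqrt{(\sqrt{2} - a)^2 + (0 - b)^2} \ge |\sqrt{2} - a| \ge \sqrt{2} - |a| \ge \sqrt{2} - 1. \]

Hence $d(D, \{(\sqrt{2}, 0)\}) > 0$. For the second part, let $(a_n)$ denote a sequence
of rationals converging to $\sqrt{2}$. Then $(a_n, 0) \in \mathbf{Q} \times \mathbf{Q}$, and

\[ d((a_n, 0), (\sqrt{2}, 0)) \to 0, \qquad n \nearrow \infty. \]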


4.2.11 For $1 > h > 0$, let $D_h^+ = \{(x,y) \in D^+ : y > h\}$. Then $D^+ \setminus D_h^+$ is
contained in a rectangle with area $h$, and $D_h^+$ and $D^- = \{(x,y) \in D : y < 0\}$
are well separated. By reflection invariance, $\operatorname{area}(D^+) = \operatorname{area}(D^-)$. So by
subadditivity,

\[
\begin{aligned}
\operatorname{area}(D) \ge \operatorname{area}(D_h^+ \cup D^-) &= \operatorname{area}(D_h^+) + \operatorname{area}(D^-) \\
&\ge \operatorname{area}(D^+) - \operatorname{area}(D^+ \setminus D_h^+) + \operatorname{area}(D^-) \\
&\ge 2 \cdot \operatorname{area}(D^+) - h.
\end{aligned}
\]

Since $h > 0$ is arbitrary, we obtain $\operatorname{area}(D) \ge 2 \cdot \operatorname{area}(D^+)$. The reverse
inequality follows by subadditivity.


4.2.12 The derivation is similar to the derivation of $\operatorname{area}(C) = 0$ presented
at the end of §4.2.


4.2.13 If $T$ is the triangle joining $(0,0)$, $(a,b)$, $(a,-b)$, where

\[ (a, b) = (\cos(\pi/n), \sin(\pi/n)), \]

then $\operatorname{area}(T) = \sin(\pi/n)\cos(\pi/n)$. Since $D_n$ is the union of $n$ triangles
$T_1, \dots, T_n$, each having the same area as $T$ (rotation invariance), subadditivity
yields $\operatorname{area}(D_n) \le n\sin(\pi/n)\cos(\pi/n) = n\sin(2\pi/n)/2$. The reverse
inequality is obtained by shrinking each triangle $T_i$ toward its center and using
well-separated additivity.
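For illustration only, the value $n\sin(2\pi/n)/2$ can be tabulated numerically. The Python sketch below (function names are illustrative assumptions, not notation from the text) evaluates both forms of the area and shows the values approaching $\pi$ as $n$ grows.

```python
import math


def inscribed_polygon_area(n: int) -> float:
    """Area n*sin(2*pi/n)/2 of the regular n-gon inscribed in the unit disk."""
    return n * math.sin(2 * math.pi / n) / 2


def triangle_sum(n: int) -> float:
    """Same area as n copies of the triangle area sin(pi/n)*cos(pi/n)."""
    return n * math.sin(math.pi / n) * math.cos(math.pi / n)


for n in (6, 12, 96, 10_000):
    print(n, inscribed_polygon_area(n), triangle_sum(n))
```

The two columns agree up to rounding, which is just the double-angle identity used in the solution.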


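4.2.14 Let $\alpha$ denote the inf on the right side. Since $(T_n)$ is a cover, we obtain
$\operatorname{area}(A) \le \sum_{n=1}^\infty \operatorname{area}(T_n) = \sum_{n=1}^\infty \|T_n\|$. Hence $\operatorname{area}(A) \le \alpha$. On the other
hand, if $(Q_n)$ is any paving of $A$, write each $Q_n = T_n \cup T_n'$ as the union of
two triangles to obtain a triangular paving $(T_n) \cup (T_n')$. Hence

\[ \alpha \le \sum_{n=1}^\infty \left( \|T_n\| + \|T_n'\| \right) = \sum_{n=1}^\infty \|Q_n\|. \]

Taking the inf over all pavings $(Q_n)$, we obtain $\alpha \le \operatorname{area}(A)$.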


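4.2.15 Assume not. Then for every rectangle $Q$, $\operatorname{area}(Q \cap A) \le \alpha \cdot \operatorname{area}(Q)$.
If $(Q_n)$ is any paving of $A$, then $(Q_n \cap A)$ is a cover of $A$. So

\[ \operatorname{area}(A) \le \sum_{n=1}^\infty \operatorname{area}(Q_n \cap A) \le \alpha \sum_{n=1}^\infty \operatorname{area}(Q_n). \]

Taking the inf of the right side over all $(Q_n)$, $\operatorname{area}(A) \le \alpha \cdot \operatorname{area}(A)$. Since
$\alpha < 1$, this yields $\operatorname{area}(A) = 0$.


Solutions to Exercises 4.3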


4.3.1 Apply dilation invariance to $g(x) = f(x)/x$. Then

\[
\begin{aligned}
\int_0^\infty f(kx) x^{-1}\,dx = k \int_0^\infty f(kx)(kx)^{-1}\,dx &= k \int_0^\infty g(kx)\,dx \\
&= k \cdot \frac{1}{k} \int_0^\infty g(x)\,dx = \int_0^\infty f(x) x^{-1}\,dx.
\end{aligned}
\]


4.3.2 Let $G$ be the subgraph of $f$ over $(a,b)$, and let $H(x,y) = (-x,y)$.
Then $H(G)$ equals $\{(-x,y) : a < x < b,\ 0 < y < f(x)\}$. But this equals
$\{(x,y) : -b < x < -a,\ 0 < y < f(-x)\}$, which is the subgraph of $f(-x)$ over
$(-b,-a)$. Thus by Exercise 4.2.6,

\[ \int_a^b f(x)\,dx = \operatorname{area}(G) = \operatorname{area}(H(G)) = \int_{-b}^{-a} f(-x)\,dx. \]


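4.3.3 Since $f$ is uniformly continuous on $[a,b]$, there is a $\delta > 0$, such that
$\mu(\delta) < \epsilon/(b-a)$. Here $\mu$ is the uniform modulus of continuity of $f$ over $[a,b]$.
In §2.3, we showed that, with this choice of $\delta$, for any partition
$a = x_0 < x_1 < \dots < x_n = b$ of mesh $< \delta$ and choice of intermediate points
$x_1^\#, \dots, x_n^\#$, the piecewise constant function $g(x) = f(x_i^\#)$, $x_{i-1} < x < x_i$, $1 \le i \le n$,
satisfies $|f(x) - g(x)| < \epsilon/(b-a)$ over $(a,b)$. Now, by additivity,

\[ \int_a^b \left[ g(x) \pm \epsilon/(b-a) \right] dx = \sum_{i=1}^n f(x_i^\#)(x_i - x_{i-1}) \pm \epsilon \]

and

\[ g(x) - \epsilon/(b-a) < f(x) < g(x) + \epsilon/(b-a), \qquad a < x < b. \]

Hence by monotonicity,

\[ \sum_{i=1}^n f(x_i^\#)(x_i - x_{i-1}) - \epsilon \le \int_a^b f(x)\,dx \le \sum_{i=1}^n f(x_i^\#)(x_i - x_{i-1}) + \epsilon. \]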


4.3.4 Since $f(x) \le x = g(x)$ on $(0,1)$ and the subgraph of $g$ is a triangle, by
monotonicity, $\int_0^1 f(x)\,dx \le \int_0^1 x\,dx = 1/2$. On the other hand, the subgraph
of $g$ equals the union of the subgraphs of $f$ and $g - f$. So by subadditivity,

\[ 1/2 = \int_0^1 g(x)\,dx \le \int_0^1 f(x)\,dx + \int_0^1 [g(x) - f(x)]\,dx. \]

But $g(x) - f(x) > 0$ iff $x \in \mathbf{Q}$. So the subgraph of $g - f$ is a countable (§1.7)
union of line segments. By subadditivity, again, the area $\int_0^1 [g(x) - f(x)]\,dx$
of the subgraph of $g - f$ equals zero. Hence $\int_0^1 f(x)\,dx = 1/2$.


4.3.5 Suppose that $g$ is a constant $c > 0$, and let $G$ be the subgraph of $f$.
Then the subgraph of $f + c$ is the union of the rectangle $Q = (a,b) \times (0,c]$
and the vertical translate $G + (0,c)$. Thus by subadditivity and translation
invariance,

\[
\begin{aligned}
\int_a^b [f(x) + g(x)]\,dx &= \operatorname{area}[Q \cup (G + (0,c))] \\
&\le \operatorname{area}(G) + \operatorname{area}(Q) \\
&= \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\end{aligned}
\]

Let $\alpha Q$ denote the centered dilate of $Q$, $0 < \alpha < 1$. Then $\alpha Q$ and $G + (0,c)$
are well separated. So

\[
\begin{aligned}
\int_a^b [f(x) + g(x)]\,dx &\ge \operatorname{area}[\alpha Q \cup (G + (0,c))] \\
&= \operatorname{area}(G) + \alpha^2 \operatorname{area}(Q) \\
&= \int_a^b f(x)\,dx + \alpha^2 \int_a^b g(x)\,dx.
\end{aligned}
\]

Let $\alpha \to 1$ to get the reverse inequality. Thus the result is true when $g$ is
constant. If $g$ is piecewise constant over a partition $a = x_0 < x_1 < \dots < x_n = b$,
then apply the constant case to the intervals $(x_{i-1}, x_i)$, $i = 1, \dots, n$,
and sum.


4.3.6 By subadditivity, $\int_0^\infty f(x)\,dx \le \sum_{n=1}^\infty \int_{n-1}^n f(x)\,dx = \sum_{n=1}^\infty c_n$. For
the reverse inequality,

\[ \int_0^\infty f(x)\,dx \ge \int_0^N f(x)\,dx = \sum_{n=1}^N c_n. \]

Now, let $N \nearrow \infty$. This establishes the nonnegative case. Now, apply the
nonnegative case to $|f|$. Then $f$ is integrable iff $\sum |c_n| < \infty$. Now, apply
the nonnegative case to $f^+$ and $f^-$, and subtract to get the result for the
integrable case.


4.3.7 To be Riemann integrable, the Riemann sum must be close to a specific
real $I$ for any partition with small enough mesh and any choice of intermediate
points $x_1^\#, \dots, x_n^\#$. But for any partition $a = x_0 < x_1 < \dots < x_n = b$, with
any mesh size, we can choose intermediate points that are irrational, leading
to a Riemann sum of 1. We can also choose intermediate points that are
rational, leading to a Riemann sum of 0. Since no real $I$ can be close to 0
and 1, simultaneously, $f$ is not Riemann integrable.


4.3.8 Apply the integral test to $f(x) = g(x\delta)$ to get

\[ \int_1^\infty g(x\delta)\,dx \le \sum_{n=1}^\infty g(n\delta) \le \int_1^\infty g(x\delta)\,dx + g(\delta). \]

By dilation invariance,

\[ \int_\delta^\infty g(x)\,dx \le \delta \sum_{n=1}^\infty g(n\delta) \le \int_\delta^\infty g(x)\,dx + \delta g(\delta). \]

Since $g$ is bounded, $\delta g(\delta) \to 0$, as $\delta \to 0+$. Now, let $\delta \to 0+$, and use
continuity at the endpoints.
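As a hedged numerical illustration of 4.3.8, the sketch below takes the sample function $g(x) = e^{-x}$ (an assumption made only for this example; the exercise treats a general decreasing $g$) and evaluates the scaled sum $\delta \sum_{n \ge 1} g(n\delta)$, truncated at a large cutoff. The helper name `scaled_sum` and the cutoff are illustrative choices.

```python
import math


def scaled_sum(g, delta, cutoff=60.0):
    """Approximate delta * sum_{n>=1} g(n*delta), truncating once n*delta exceeds cutoff."""
    n_max = int(cutoff / delta)
    return delta * sum(g(n * delta) for n in range(1, n_max + 1))


def g(x):
    # sample decreasing, bounded function with integral 1 over (0, infinity)
    return math.exp(-x)


for delta in (1.0, 0.1, 0.01, 0.001):
    print(delta, scaled_sum(g, delta))
```

For this $g$ the printed values increase toward $\int_0^\infty e^{-x}\,dx = 1$ as $\delta \to 0+$, and each one sits between $\int_\delta^\infty g$ and $\int_\delta^\infty g + \delta g(\delta)$, as the displayed inequalities require.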


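4.3.9 If $f$ is even and nonnegative or integrable, by Exercise 4.3.2,

\[
\begin{aligned}
\int_{-b}^b f(x)\,dx &= \int_{-b}^0 f(x)\,dx + \int_0^b f(x)\,dx \\
&= \int_0^b f(-x)\,dx + \int_0^b f(x)\,dx \\
&= 2 \int_0^b f(x)\,dx.
\end{aligned}
\]

If $f$ is odd and integrable,

\[
\begin{aligned}
\int_{-b}^b f(x)\,dx &= \int_{-b}^0 f(x)\,dx + \int_0^b f(x)\,dx \\
&= \int_0^b f(-x)\,dx + \int_0^b f(x)\,dx = 0.
\end{aligned}
\]


4.3.10 By the previous exercise, $\int_{-\infty}^\infty e^{-a|x|}\,dx = 2\int_0^\infty e^{-ax}\,dx$. By the
integral test, $\int_1^\infty e^{-ax}\,dx \le \sum_{n=1}^\infty e^{-an} \le 1/(1 - e^{-a}) < \infty$. Hence

\[ \int_0^\infty e^{-ax}\,dx = \int_0^1 e^{-ax}\,dx + \int_1^\infty e^{-ax}\,dx \le 1 + \frac{1}{1 - e^{-a}}. \]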


4.3.11 Since $f$ is superlinear, for any $M > 0$, there is a $b > 0$, such
that $f(x)/x > M$ for $x > b$. Hence
$\int_b^\infty e^{sx} e^{-f(x)}\,dx \le \int_b^\infty e^{sx} e^{-Mx}\,dx = \int_b^\infty e^{-(M-s)|x|}\,dx$.
Similarly, there is an $a < 0$ such that $f(x)/(-x) > M$ for $x < a$. Hence
$\int_{-\infty}^a e^{sx} e^{-f(x)}\,dx \le \int_{-\infty}^a e^{sx} e^{Mx}\,dx = \int_{-\infty}^a e^{-(M+s)|x|}\,dx$.
Since $f$ is continuous, $f$ is bounded on $(a,b)$, hence $e^{sx}e^{-f(x)}$ is integrable over $(a,b)$.
Thus with $g(x) = e^{sx} e^{-f(x)}$,

\[
\begin{aligned}
\int_{-\infty}^\infty e^{sx} e^{-f(x)}\,dx &= \int_{-\infty}^a g(x)\,dx + \int_a^b g(x)\,dx + \int_b^\infty g(x)\,dx \\
&\le \int_{-\infty}^a e^{-(M+s)|x|}\,dx + \int_a^b e^{sx} e^{-f(x)}\,dx + \int_b^\infty e^{-(M-s)|x|}\,dx,
\end{aligned}
\]

which is finite, as soon as $M$ is chosen $> |s|$.


4.3.12 Suppose that there was such a $\delta$. First, choose $f(x) = 1$, for all $x \in \mathbf{R}$,
in (4.3.13) to get $\int_{-\infty}^\infty \delta(x)\,dx = 1$. Thus $\delta$ is integrable over $\mathbf{R}$. Now, let $f$
equal 1 at all points except at zero, where we set $f(0) = 0$. Then (4.3.13)
fails because the integral is still 1, but the right side vanishes. However, this
is of no use, since we are assuming (4.3.13) for $f$ continuous only. Because of
this, we let $f_n(x) = 1$ for $|x| \ge 1/n$ and $f_n(x) = n|x|$ for $|x| \le 1/n$. Then $f_n$
is continuous and nonnegative. Hence by monotonicity,

\[ 0 \le \int_{|x| \ge 1/n} \delta(x)\,dx \le \int_{-\infty}^\infty \delta(x) f_n(x)\,dx = f_n(0) = 0, \]

for all $n \ge 1$. This shows that $\int_{|x| \ge 1/n} \delta(x)\,dx = 0$ for all $n \ge 1$. But, by
continuity at the endpoints,

\[
\begin{aligned}
1 = \int_{-\infty}^\infty \delta(x)\,dx &= \int_{-\infty}^0 \delta(x)\,dx + \int_0^\infty \delta(x)\,dx \\
&= \lim_{n \nearrow \infty} \left( \int_{-\infty}^{-1/n} \delta(x)\,dx + \int_{1/n}^\infty \delta(x)\,dx \right) \\
&= \lim_{n \nearrow \infty} (0 + 0) = 0,
\end{aligned}
\]

a contradiction.


4.3.13 By Exercise 3.3.7 with $x = c \pm \delta$,

\[ f(c \pm \delta) - f(c) \ge \pm f'_\pm(c)\,\delta. \]

Since $f'_+$ is increasing,

\[ f'_+(c)\,\delta \ge \int_{c-\delta}^c f'_+(x)\,dx. \]

Also since $f'_-$ is increasing,

\[ f'_-(c)\,\delta \le \int_c^{c+\delta} f'_-(x)\,dx. \]

Combining these inequalities yields (4.3.14). Now note that $f'_\pm$ are both
increasing and therefore bounded on $[a,b]$ between $f'_+(a)$ and $f'_-(b)$, hence
integrable on $(a,b)$. Select $n \ge 1$ and let $\delta = (b-a)/n$ and
$a = x_0 < x_1 < \dots < x_n = b$ be the partition of $[a,b]$ given by $x_i = a + i\delta$,
$i = 0, \dots, n$. Then applying (4.3.14) at $c = x_i$ yields

\[ f(x_{i+1}) - f(x_i) \ge \int_{x_{i-1}}^{x_i} f'_+(x)\,dx. \]

Summing over $1 \le i \le n-1$, we obtain

\[ f(b) - f(a + \delta) \ge \int_a^{b-\delta} f'_+(x)\,dx. \]

Let $\delta \to 0$. Since $f$ is continuous (Exercise 3.3.6) and $f'_+$ is integrable, continuity
at the endpoints implies

\[ f(b) - f(a) \ge \int_a^b f'_+(x)\,dx. \]

Similarly,

\[ f(x_i) - f(x_{i-1}) \le \int_{x_i}^{x_{i+1}} f'_-(x)\,dx. \]

Summing over $1 \le i \le n-1$, we obtain

\[ f(b - \delta) - f(a) \le \int_{a+\delta}^b f'_-(x)\,dx. \]

Now let $\delta \to 0$. Since $f'_-(t) \le f'_+(t)$, the result follows.


4.3.14 From §4.3, we know $\lim_{n \to \infty} F(n\pi)$ exists; it follows that any
subsequence $(F(N_n\pi))$ is convergent. Given $b_n \to \infty$, let $N_n = \lfloor b_n/\pi \rfloor$. Then
$|b_n - N_n\pi| \le \pi$ and

\[ |F(b_n) - F(N_n\pi)| = \left| \int_{N_n\pi}^{b_n} \frac{\sin x}{x}\,dx \right| \le \frac{1}{N_n}. \]

It follows that $F(\infty)$ exists. Since $F : (0,\infty) \to \mathbf{R}$ is continuous, it follows $F$
is bounded as in Exercise 2.3.24.


Solutions to Exercises 4.4 


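4.4.1 $F(x) = e^{-sx}/(-s)$ is a primitive of $f(x) = e^{-sx}$, and $e^{-sx}$ is positive. So

\[ \int_0^\infty e^{-sx}\,dx = \left. \frac{1}{-s} e^{-sx} \right|_0^\infty = \frac{1}{s}, \qquad s > 0. \]


4.4.2 $x^r/r$ is a primitive of $x^{r-1}$ for $r \ne 0$, and $\log x$ is a primitive, when
$r = 0$. Thus $\int_0^1 dx/x = \log x|_0^1 = 0 - (-\infty) = \infty$ and
$\int_1^\infty dx/x = \log x|_1^\infty = \infty - 0 = \infty$. Hence all three integrals are equal to
$\infty$, when $r = 0$. Now,

\[ \int_0^1 x^{r-1}\,dx = \frac{1}{r} - \frac{1}{r} \lim_{x \to 0+} x^r = \begin{cases} 1/r, & r > 0, \\ \infty, & r < 0. \end{cases} \]

Also

\[ \int_1^\infty x^{r-1}\,dx = \frac{1}{r} \lim_{x \to \infty} x^r - \frac{1}{r} = \begin{cases} -1/r, & r < 0, \\ \infty, & r > 0. \end{cases} \]

Since $\int_0^\infty = \int_0^1 + \int_1^\infty$, $\int_0^\infty x^{r-1}\,dx = \infty$ in all cases.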


4.4.3 Pick $c$ in $(a,b)$. Since any primitive $F$ differs from $F_c$ by a constant,
it is enough to verify the result for $F_c$. But $f$ bounded and $(a,b)$ bounded
imply $f$ integrable. So $F_c(a+)$ and $F_c(b-)$ exist and are finite by continuity
at the endpoints.


4.4.4 By the fundamental theorem, for $T > 1$,

\[ f(T) - f(1) = f(T-) - f(1+) = \int_1^T f'(x)\,dx. \]

The result follows by sending $T \to \infty$ and using continuity at the endpoints.


4.4.5 By the integral test, $\sum_{n=1}^\infty f'(n) < \infty$ iff $\int_1^\infty f'(x)\,dx < \infty$ which by
the previous exercise happens iff $f(\infty) < \infty$. Similarly $\sum_{n=1}^\infty f'(n)/f(n) < \infty$
iff $\log f(\infty) < \infty$.


4.4.6 Take $u = 1/x$ and $dv = \sin x\,dx$. Then $du = -dx/x^2$ and
$v = -\cos x$. So

\[ \int_1^b \frac{\sin x}{x}\,dx = \left. \frac{-\cos x}{x} \right|_1^b - \int_1^b \frac{\cos x}{x^2}\,dx. \]

But, by (4.3.1), $\cos x/x^2$ is integrable over $(1,\infty)$. So by continuity at the
endpoints,

\[ \lim_{b \to \infty} \int_1^b \frac{\sin x}{x}\,dx = \cos 1 - \int_1^\infty \frac{\cos x}{x^2}\,dx. \]

Since $F(b) - \int_1^b \sin x\,(dx/x)$ does not depend on $b$, $F(\infty)$ exists and is finite.


4.4.7 The function $g(t) = e^{-t}$ is strictly monotone with $g((0,\infty)) = (0,1)$.
Now, apply substitution.


4.4.8 Let $u = x^n$ and $dv = e^{-sx}\,dx$. Then $du = nx^{n-1}\,dx$, and
$v = e^{-sx}/(-s)$. Hence

\[ \int_0^\infty e^{-sx} x^n\,dx = \left. \frac{e^{-sx} x^n}{-s} \right|_0^\infty + \frac{n}{s} \int_0^\infty e^{-sx} x^{n-1}\,dx. \]

If we call the integral on the left $I_n$, this says that $I_n = (n/s) I_{n-1}$. Iterating
this down to $n = 0$ yields $I_n = n!/s^{n+1}$ since $I_0 = 1/s$ from Exercise 4.4.1.
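The recursion $I_n = (n/s) I_{n-1}$ with $I_0 = 1/s$ unwinds to $n!/s^{n+1}$. The Python sketch below, offered only as a sanity check, iterates the recursion exactly and compares one value against a crude midpoint-rule quadrature; the function names, the truncation point `upper`, and the step count are illustrative assumptions.

```python
import math
from fractions import Fraction


def moment_by_recursion(n, s):
    """Iterate I_k = (k/s) * I_{k-1} starting from I_0 = 1/s, exactly."""
    s = Fraction(s)
    value = 1 / s
    for k in range(1, n + 1):
        value = Fraction(k) / s * value
    return value


def moment_by_quadrature(n, s, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-s*x) * x**n over (0, upper)."""
    h = upper / steps
    return h * sum(math.exp(-s * (i + 0.5) * h) * ((i + 0.5) * h) ** n
                   for i in range(steps))


print(moment_by_recursion(3, 2), moment_by_quadrature(3, 2.0))
```

With $n = 3$ and $s = 2$ both computations give $3!/2^4 = 3/8$, up to quadrature error in the second.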


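4.4.9 Call the integrals $I_s$ and $I_c$. Let $u = \sin(sx)$ and $dv = e^{-nx}\,dx$. Then
$du = s\cos(sx)\,dx$ and $v = e^{-nx}/(-n)$. So

\[ I_s = \left. \frac{e^{-nx} \sin(sx)}{-n} \right|_0^\infty + \frac{s}{n} \int_0^\infty e^{-nx} \cos(sx)\,dx = \frac{s}{n} I_c. \]

Now, let $u = \cos(sx)$ and $dv = e^{-nx}\,dx$. Then $du = -s\sin(sx)\,dx$ and
$v = e^{-nx}/(-n)$. So

\[ I_c = \left. \frac{e^{-nx} \cos(sx)}{-n} \right|_0^\infty - \frac{s}{n} \int_0^\infty e^{-nx} \sin(sx)\,dx = \frac{1}{n} - \frac{s}{n} I_s. \]

Thus $nI_s = sI_c$, and $nI_c = 1 - sI_s$. Solving, we obtain $I_s = s/(n^2 + s^2)$ and
$I_c = n/(n^2 + s^2)$.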
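A quick exact check that the stated values solve the two linear relations $nI_s = sI_c$ and $nI_c = 1 - sI_s$ can be done with rational arithmetic. The Python sketch below is illustrative only; the function name is an assumption.

```python
from fractions import Fraction


def laplace_sin_cos(n, s):
    """Return the claimed values (I_s, I_c) = (s/(n^2+s^2), n/(n^2+s^2))."""
    n, s = Fraction(n), Fraction(s)
    denom = n * n + s * s
    return s / denom, n / denom


n, s = 3, 4
I_s, I_c = laplace_sin_cos(n, s)
# Both relations from the integration by parts should hold exactly.
print(n * I_s == s * I_c, n * I_c == 1 - s * I_s)
```

Since the system is linear with nonzero determinant $n^2 + s^2$, the pair is the unique solution.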


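4.4.10 Let $u = t^{x-1}$ and $dv = e^{-t^2/2}\,t\,dt$. Then $du = (x-1)t^{x-2}\,dt$, and
$v = -e^{-t^2/2}$. So

\[
\begin{aligned}
\int_0^\infty e^{-t^2/2} t^x\,dt &= \left. -e^{-t^2/2} t^{x-1} \right|_0^\infty + (x-1) \int_0^\infty e^{-t^2/2} t^{x-2}\,dt \\
&= (x-1) \int_0^\infty e^{-t^2/2} t^{x-2}\,dt.
\end{aligned}
\]

If $I_n = \int_0^\infty e^{-t^2/2} t^n\,dt$, then

\[
\begin{aligned}
I_{2n+1} = 2n \cdot I_{2n-1} &= 2n \cdot (2n-2) I_{2n-3} \\
&= \dots = 2n \cdot (2n-2) \cdots 4 \cdot 2 \cdot I_1 = 2^n n!\, I_1.
\end{aligned}
\]

But, substituting $u = t^2/2$, $du = t\,dt$ yields

\[ I_1 = \int_0^\infty e^{-t^2/2}\,t\,dt = \int_0^\infty e^{-u}\,du = 1. \]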
0 0 


4.4.11 Let $u = (1-t)^n$ and $dv = t^{x-1}\,dt$. Then $du = -n(1-t)^{n-1}\,dt$, and
$v = t^x/x$. So

\[
\begin{aligned}
\int_0^1 (1-t)^n t^{x-1}\,dt &= \left. \frac{(1-t)^n t^x}{x} \right|_0^1 + \frac{n}{x} \int_0^1 (1-t)^{n-1} t^{x+1-1}\,dt \\
&= \frac{n}{x} \int_0^1 (1-t)^{n-1} t^{x+1-1}\,dt.
\end{aligned}
\]

Thus integrating by parts increases the $x$ by 1 and decreases the $n$ by 1.
Iterating this $n$ times,

\[ \int_0^1 (1-t)^n t^{x-1}\,dt = \frac{n \cdot (n-1) \cdots 1}{x \cdot (x+1) \cdots (x+n-1)} \int_0^1 t^{x+n-1}\,dt. \]

But $\int_0^1 t^{x+n-1}\,dt = 1/(x+n)$. So

\[ \int_0^1 (1-t)^n t^{x-1}\,dt = \frac{n!}{x \cdot (x+1) \cdots (x+n)}. \]

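4.4.12 Let $t = -\log x$, $x = e^{-t}$. Then from Exercise 4.4.7 and Exercise 4.4.8,

\[ \int_0^1 (-\log x)^n\,dx = \int_0^\infty e^{-t} t^n\,dt = n!. \]


4.4.13 Let $I_n = \int_{-1}^1 (x^2 - 1)^n\,dx$ and $u = (x^2 - 1)^n$, $dv = dx$. Then

\[
\begin{aligned}
I_n &= \left. x(x^2 - 1)^n \right|_{-1}^1 - 2n \int_{-1}^1 (x^2 - 1)^{n-1} x^2\,dx \\
&= -2n I_n - 2n I_{n-1}.
\end{aligned}
\]

Solving for $I_n$, we obtain

\[ I_n = -\frac{2n}{2n+1} I_{n-1} = 2 \cdot (-1)^n \frac{2n \cdot (2n-2) \cdots 2}{(2n+1) \cdot (2n-1) \cdots 3} \]

since $I_0 = 2$.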


4.4.14 Let $f(x) = (x^2 - 1)^n$. Then $P_n(x) = f^{(n)}(x)/2^n n!$. Note that $f(\pm 1) = 0$,
$f'(\pm 1) = 0$, ..., and $f^{(n-1)}(\pm 1) = 0$, since all these derivatives have at
least one factor $(x^2 - 1)$ by the product rule. Hence integrating by parts,

\[ \int_{-1}^1 f^{(n)}(x) f^{(n)}(x)\,dx = -\int_{-1}^1 f^{(n-1)}(x) f^{(n+1)}(x)\,dx \]

increases one index and decreases the other. Iterating, we get

\[ \int_{-1}^1 P_n(x)^2\,dx = \frac{1}{(2^n n!)^2} \int_{-1}^1 \left[ f^{(n)}(x) \right]^2 dx = \frac{(-1)^n}{(2^n n!)^2} \int_{-1}^1 f(x) f^{(2n)}(x)\,dx. \]

But $f$ is a polynomial of degree $2n$ with highest-order coefficient 1. Hence
$f^{(2n)}(x) = (2n)!$. So

\[ \int_{-1}^1 P_n(x)^2\,dx = \frac{(-1)^n (2n)!}{(2^n n!)^2} \int_{-1}^1 (x^2 - 1)^n\,dx. \]

Now, inserting the result of the previous exercise and simplifying leads to
$2/(2n+1)$.
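The final simplification is easy to get wrong by hand, so here is a hedged exact check in Python. It computes $\int_{-1}^1 (x^2-1)^n\,dx$ from the binomial expansion (rather than from the recursion of Exercise 4.4.13) and multiplies by the factor $(-1)^n (2n)!/(2^n n!)^2$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def integral_x2_minus_1_pow(n):
    """Exact integral of (x^2 - 1)^n over (-1, 1) via the binomial expansion."""
    total = Fraction(0)
    for k in range(n + 1):
        # term C(n,k) * x^(2k) * (-1)^(n-k); the integral of x^(2k) over (-1,1) is 2/(2k+1)
        total += Fraction(comb(n, k) * (-1) ** (n - k) * 2, 2 * k + 1)
    return total


def legendre_norm_squared(n):
    """(-1)^n (2n)! / (2^n n!)^2 times the integral of (x^2 - 1)^n over (-1, 1)."""
    factor = Fraction((-1) ** n * factorial(2 * n), (2 ** n * factorial(n)) ** 2)
    return factor * integral_x2_minus_1_pow(n)


for n in range(5):
    print(n, integral_x2_minus_1_pow(n), legendre_norm_squared(n))
```

The last column prints $2, 2/3, 2/5, 2/7, 2/9$, matching $2/(2n+1)$, and the middle column satisfies $I_n = -\frac{2n}{2n+1} I_{n-1}$.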


4.4.15 Apply the fundamental theorem to the result of Exercise 3.7.11.


4.4.16 With $\pi = a/b$ and

\[ g_n(x) = \frac{(bx)^n (a - bx)^n}{n!}, \]

it is clear that $I_n > 0$ since $\sin x$ and $g_n(x)$ are positive on $(0,\pi)$. Moreover
$I_n \in \mathbf{Z}$ follows from Exercise 3.3.30 and Exercise 4.4.15. Finally, $I_n \le 2M^n/n!$
where $M = \max bx(a - bx)$ over $[0,\pi]$, which goes to 0 as $n \to \infty$.
But there are no integers between 0 and 1; hence $\pi$ is irrational.


4.4.17 By the integral test, $\zeta(s)$ differs from $\int_1^\infty x^{-s}\,dx$ by, at most, 1. But
$\int_1^\infty x^{-s}\,dx$ converges for $s > 1$ by Exercise 4.4.2.


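4.4.18 Here $f(x) = 1/x$ and $\int_1^{n+1} f(x)\,dx = \log(n+1)$. Since $1/(n+1) \to 0$,
by the integral test,

\[
\begin{aligned}
\gamma &= \lim_{n \nearrow \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} - \log n \right) \\
&= \lim_{n \nearrow \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} + \frac{1}{n+1} - \log(n+1) \right) \\
&= \lim_{n \nearrow \infty} \left[ f(1) + f(2) + \dots + f(n) - \int_1^{n+1} f(x)\,dx \right]
\end{aligned}
\]

exists and satisfies $0 < \gamma < 1$.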
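For a feel for the limit in 4.4.18, the partial expressions $1 + \frac12 + \dots + \frac1n - \log n$ can be evaluated in floating point. The Python sketch below is only illustrative; the function name is an assumption, and the reference value $0.5772156649\ldots$ for Euler's constant is quoted from standard tables rather than from the text.

```python
import math


def gamma_approx(n: int) -> float:
    """Return 1 + 1/2 + ... + 1/n - log(n), which tends to Euler's constant."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n)


for n in (10, 1_000, 100_000):
    print(n, gamma_approx(n))
```

The printed values decrease toward roughly 0.5772 and stay strictly between 0 and 1, consistent with $0 < \gamma < 1$.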


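4.4.19 Call the integrals $I_n^c$ and $I_n^s$. Then $I_0^s = 0$, and $I_n^c = 0$, $n \ge 0$, since
the integrand is odd. Also $I_n^s = 2\int_0^\pi x\sin(nx)\,dx$ since the integrand is even.
Now, for $n \ge 1$,

\[
\begin{aligned}
\int_0^\pi x \sin(nx)\,dx &= \left. -\frac{x\cos(nx)}{n} \right|_0^\pi + \frac{1}{n} \int_0^\pi \cos(nx)\,dx \\
&= -\frac{\pi\cos(n\pi)}{n} + \frac{1}{n} \cdot \left. \frac{\sin(nx)}{n} \right|_0^\pi \\
&= \frac{(-1)^{n-1}\pi}{n}.
\end{aligned}
\]

Thus $I_n^s = 2\pi(-1)^{n-1}/n$, $n \ge 1$.


4.4.20 By oddness, $\int_{-\pi}^\pi \sin(nx)\cos(mx)\,dx = 0$ for all $m, n \ge 0$. For $m \ne n$,
using (3.6.3),

\[
\begin{aligned}
\int_{-\pi}^\pi \sin(nx)\sin(mx)\,dx &= \frac{1}{2} \int_{-\pi}^\pi \left[ \cos((n-m)x) - \cos((n+m)x) \right] dx \\
&= \frac{1}{2} \left. \left( \frac{\sin((n-m)x)}{n-m} - \frac{\sin((n+m)x)}{n+m} \right) \right|_{-\pi}^\pi = 0.
\end{aligned}
\]

For $m = n$,

\[
\begin{aligned}
\int_{-\pi}^\pi \sin(nx)\sin(mx)\,dx &= \frac{1}{2} \int_{-\pi}^\pi \left[ 1 - \cos(2nx) \right] dx \\
&= \frac{1}{2} \left. \left[ x - \sin(2nx)/2n \right] \right|_{-\pi}^\pi = \pi.
\end{aligned}
\]

Similarly, for $\int_{-\pi}^\pi \cos(nx)\cos(mx)\,dx$. Hence

\[ \int_{-\pi}^\pi \cos(nx)\cos(mx)\,dx = \int_{-\pi}^\pi \sin(nx)\sin(mx)\,dx = \begin{cases} 0 & \text{if } n \ne m, \\ \pi & \text{if } n = m. \end{cases} \]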


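4.4.21 By additivity, $q(t) = at^2 + 2bt + c$ where $a = \int_a^b g(t)^2\,dt$,
$b = \int_a^b f(t)g(t)\,dt$, and $c = \int_a^b f(t)^2\,dt$. Since $q$ is nonnegative, $q$ has at most
one root. Hence $b^2 - ac \le 0$, which is Cauchy-Schwarz.


4.4.22 By substituting $t = n(1 - s)$, $dt = -n\,ds$, and equation (2.3.2),

\[ \int_0^n \frac{1 - (1 - t/n)^n}{t}\,dt = \int_0^1 \frac{1 - s^n}{1 - s}\,ds = \int_0^1 \left[ 1 + s + \dots + s^{n-1} \right] ds = 1 + \frac{1}{2} + \dots + \frac{1}{n}. \]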


4.4.23 Continuity of $F$ was established in §4.3. Now, $f$ is continuous on the
subinterval $(x_{i-1}, x_i)$. Hence $F(x) = F(x_{i-1}) + F_{x_{i-1}}(x)$ is differentiable by
the first fundamental theorem.


4.4.24 For any $f$, let $I(f) = \int_a^b f(x)\,dx$. If $g : [a,b] \to \mathbf{R}$ is nonnegative
and continuous, we can (§2.3) find a piecewise constant $g_\epsilon \ge 0$, such that
$g_\epsilon(x) \le g(x) + \epsilon \le g_\epsilon(x) + 2\epsilon$ on $a \le x \le b$. By monotonicity,

\[ I(g_\epsilon) \le I(g + \epsilon) \le I(g_\epsilon + 2\epsilon). \]

But, by Exercise 4.3.5, $I(g + \epsilon) = I(g) + \epsilon(b-a)$ and
$I(g_\epsilon + 2\epsilon) = I(g_\epsilon) + 2\epsilon(b-a)$. Hence

\[ I(g_\epsilon) \le I(g) + \epsilon(b-a) \le I(g_\epsilon) + 2\epsilon(b-a), \]

or

\[ |I(g) - I(g_\epsilon)| \le \epsilon(b-a). \]

Similarly, since $f(x) + g_\epsilon(x) \le f(x) + g(x) + \epsilon \le f(x) + g_\epsilon(x) + 2\epsilon$,

\[ |I(f + g) - I(f) - I(g_\epsilon)| \le \epsilon(b-a), \]

where we have used Exercise 4.3.5, again. Thus $|I(f + g) - I(f) - I(g)| \le 2\epsilon(b-a)$.
Since $\epsilon$ is arbitrary, we conclude that $I(f + g) = I(f) + I(g)$.


4.4.25 Let $m_i = g(t_i)$, $i = 0, 1, \dots, n+1$. For each $i = 1, \dots, n+1$, define
$\#_i : (m, M) \to \{0, 1\}$ by setting $\#_i(x) = 1$ if $x$ is between $m_{i-1}$ and $m_i$ and
$\#_i(x) = 0$, otherwise. Since the $m_i$'s may not be increasing, for a given $x$,
more than one $\#_i(x)$, $i = 1, \dots, n+1$, may equal one. In fact, for any $x$ not
equal to the $m_i$'s,

\[ \#(x) = \#_1(x) + \dots + \#_{n+1}(x). \]

Since $g$ is strictly monotone on $(t_{i-1}, t_i)$,

\[ \int_{t_{i-1}}^{t_i} f(g(t))\,|g'(t)|\,dt = \int_{\min(m_{i-1}, m_i)}^{\max(m_{i-1}, m_i)} f(x)\,dx = \int_m^M f(x)\,\#_i(x)\,dx. \]

Now, add these equations over $1 \le i \le n+1$ to get

\[
\begin{aligned}
\int_a^b f(g(t))\,|g'(t)|\,dt &= \sum_{i=1}^{n+1} \int_m^M f(x)\,\#_i(x)\,dx \\
&= \int_m^M \sum_{i=1}^{n+1} f(x)\,\#_i(x)\,dx = \int_m^M f(x)\,\#(x)\,dx.
\end{aligned}
\]

Here the last equality follows from the fact that $\#(x)$ and $\sum_{i=1}^{n+1} \#_i(x)$ differ
only on finitely many points in $(m, M)$.


4.4.26 Let $v_f(a,b)$ be the total variation of $f$ over $[a,b]$ and assume first $f'$
is continuous on an open interval containing $[a,b]$. If $a = x_0 < x_1 < \dots < x_n = b$
is a partition, then

\[ |f(x_i) - f(x_{i-1})| = \left| \int_{x_{i-1}}^{x_i} f'(x)\,dx \right| \le \int_{x_{i-1}}^{x_i} |f'(x)|\,dx \]

by the fundamental theorem. Summing this over $1 \le i \le n$ yields the first
part. Also since this was any partition, taking the sup over all partitions
shows

\[ v_f(a,b) \le \int_a^b |f'(x)|\,dx. \]

Call this last integral $I$. To show that $v_f(a,b)$ equals $I$, given $\epsilon$, we will exhibit
a partition whose variation is within $\epsilon$ of $I$. Now, since $|f'|$ is continuous over
$[a,b]$, $|f'|$ is Riemann integrable. Hence (Exercise 4.3.3), given $\epsilon > 0$, there
is a partition $a = x_0 < x_1 < \dots < x_n = b$ whose corresponding Riemann
sum $\sum_{i=1}^n |f'(x_i^\#)|(x_i - x_{i-1})$ is within $\epsilon$ of $I$, for any choice of intermediate
points $x_i^\#$, $i = 1, \dots, n$. But, by the mean value theorem,

\[ \sum_{i=1}^n |f(x_i) - f(x_{i-1})| = \sum_{i=1}^n |f'(x_i^\#)|(x_i - x_{i-1}) \]

for some intermediate points $x_i^\#$, $i = 1, \dots, n$. Thus the variation of this
partition is within $\epsilon$ of $I$ and we conclude $v_f(c,d) = \int_c^d |f'(x)|\,dx$ for all
$[c,d] \subset (a,b)$. Now $v_f(a,b) = \int_a^b |f'(x)|\,dx$ follows.


4.4.27 Let $p$ denote the integral. Using the equation $y = \sqrt{1 - x^2}$ for
the upper-half unit circle, we have $y' = -x/\sqrt{1 - x^2}$; hence
$\sqrt{1 + y'^2} = 1/\sqrt{1 - x^2}$, hence the formula for $p$. Since $x^2 < x$ on $(0,1)$, it follows that
$\sqrt{1 - x^2} > \sqrt{1 - x}$ on $(0,1)$. Thus

\[ p = 2 \int_0^1 \frac{dx}{\sqrt{1 - x^2}} \le 2 \int_0^1 \frac{dx}{\sqrt{1 - x}} = \left. -4\sqrt{1 - x} \right|_0^1 = 4. \]

Here we used the fact that the integrand is even on $[-1,1]$ and the primitive
of $2/\sqrt{1 - x}$ is $-4\sqrt{1 - x}$.


4.4.28 If $y > 0$, then $\sin\theta > 0$, so $0 < \theta < \pi$ and the length of the
counterclockwise arc joining $(1,0)$ to $(x,y)$ is

\[ L = \int_{\cos\theta}^1 \frac{dx}{\sqrt{1 - x^2}}. \]

Now the substitution $x = \cos t$ transforms the integral to $\int_0^\theta dt = \theta$. When
$y < 0$, $\sin\theta < 0$, hence $\pi < \theta < 2\pi$. Now the lower-half unit circle is
$y = -\sqrt{1 - x^2}$; hence $\sqrt{1 + y'^2} = 1/\sqrt{1 - x^2}$, so

\[ L = \pi + L' = \pi + \int_{-1}^{\cos\theta} \frac{dx}{\sqrt{1 - x^2}}. \]

Now the substitution $x = \cos t$ transforms the integral to $\int_\pi^\theta dt = \theta - \pi$; hence
the result follows.


4.4.29 $\theta'(x) = -1/\sqrt{1 - x^2}$ follows from the FTC. $\theta$ is continuous by
continuity at the endpoints, and $\theta$ is strictly decreasing since $\theta' < 0$. Since $c$ is
the inverse of $\theta$, $c' = -\sqrt{1 - c^2}$ follows from the IFT. Since $s = \sqrt{1 - c^2}$, this
implies $c' = -s$. Also $s' = (1/2)(1 - c^2)^{-1/2}(-2cc') = -cc'/s = c$.


4.4.30 We use the integral formula for the remainder in Taylor's theorem to
get, for $|x - c| < d$,


|ar == Cala ’ n 
Ryzil(a,c) < (n+ a < g(e+ s(x —c))(1 — 8)" ds. 
0 


But this last integral is no larger than $\int_0^1 g(c + s(x - c))\,ds$; hence the remainder
goes to zero.


4.4.31 Integration by parts applied on $(0,b)$ yields that $(fG)(0+)$ exists and

\[ \int_0^b f(x)g(x)\,dx = f(b)G(b) - (fG)(0+) - \int_0^b f'(x)G(x)\,dx. \tag{A.4.1} \]

Now suppose $|G(x)| \le M$. Since $f' \le 0$, by continuity at the endpoints,

\[
\begin{aligned}
\int_1^\infty |f'(x)G(x)|\,dx &\le M \lim_{b \to \infty} \int_1^b (-f'(x))\,dx \\
&= M \lim_{b \to \infty} (f(1) - f(b)) = M f(1).
\end{aligned}
\]

Thus $f'G$ is integrable over $(0,\infty)$. Sending $b \to \infty$ in (A.4.1), the result
follows since $f(\infty) = 0$.


Solutions to Exercises 4.5 


4.5.1 Let $Q = (a,b) \times (c,d)$. If $(x,y) \in Q$ and $(x',y') \notin Q$, then the
distance from $(x,y)$ to $(x',y')$ is no smaller than the distance from $(x,y)$ to the
boundary of $Q$, which, in turn, is no smaller than the minimum of $|x - a|$,
$|x - b|$, $|y - c|$, and $|y - d|$.


4.5.2 Let $Q_n = (-1/n, 1/n) \times (-1/n, 1/n)$, $n \ge 1$. Then $Q_n$ is an open set
for each $n \ge 1$, and $\bigcap_{n=1}^\infty Q_n$ is a single point $\{(0,0)\}$, which is not open.


4.5.3 If $Q$ is compact, then $Q^c$ is a union of four open rectangles. So $Q^c$ is
open. So $Q$ is closed. If $C_n$, $n \ge 1$, is closed, then $C_n^c$ is open. So

\[ \left( \bigcap_{n=1}^\infty C_n \right)^c = \bigcup_{n=1}^\infty C_n^c \]

is open. Hence $\bigcap_{n=1}^\infty C_n$ is closed. Let $Q_n = [0,1] \times [1/n, 1]$, $n \ge 1$. Then $Q_n$
is closed, but

\[ \bigcup_{n=1}^\infty Q_n = [0,1] \times (0,1] \]

is not.


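4.5.4 It is enough to show that $C^c$ is open. But $C^c$ is the union of the
four sets (draw a picture) $(-\infty, a) \times \mathbf{R}$, $(b, \infty) \times \mathbf{R}$, $(a,b) \times (-\infty, 0)$, and
$\{(x,y) : a < x < b,\ y > f(x)\}$. The first three sets are clearly open, whereas
the fourth is shown to be open using the continuity of $f$, exactly as in the text.
Thus $C$ is closed. Since $C$ contains the subgraph of $f$, $\operatorname{area}(C) \ge \int_a^b f(x)\,dx$.
On the other hand, $C$ is contained in the union of the subgraph of $f + \epsilon/(1 + x^2)$
with $L_a$ and $L_b$. Thus

\[
\begin{aligned}
\operatorname{area}(C) &\le \int_a^b \left[ f(x) + \epsilon/(1 + x^2) \right] dx + \operatorname{area}(L_a) + \operatorname{area}(L_b) \\
&= \int_a^b f(x)\,dx + \epsilon \int_a^b \frac{dx}{1 + x^2} \le \int_a^b f(x)\,dx + \epsilon \int_{-\infty}^\infty \frac{dx}{1 + x^2} \\
&= \int_a^b f(x)\,dx + \epsilon\pi.
\end{aligned}
\]

Since $\epsilon > 0$ is arbitrary, the result follows.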


4.5.5 Distance is always nonnegative, so ($\iff$ means iff),

\[
\begin{aligned}
C \text{ is closed} &\iff C^c \text{ is open} \\
&\iff d((x,y), (C^c)^c) > 0 \text{ iff } (x,y) \in C^c \\
&\iff d((x,y), C) > 0 \text{ iff } (x,y) \in C^c \\
&\iff d((x,y), C) = 0 \text{ iff } (x,y) \in C.
\end{aligned}
\]

This is the first part. For the second, let $(x,y) \in G_n$. If
$\alpha = 1/n - d((x,y), C) > 0$ and $|x - x'| < \epsilon$, $|y - y'| < \epsilon$, then
$d((x',y'), C) \le d((x,y), C) + 2\epsilon = 1/n + 2\epsilon - \alpha$, by the triangle inequality. Thus for
$\epsilon < \alpha/2$, $(x',y') \in G_n$. This shows that $Q_\epsilon \subset G_n$, where $Q_\epsilon$ is the open
rectangle centered at $(x,y)$ with sides of length $2\epsilon$. Thus $G_n$ is open, and
$\bigcap_{n=1}^\infty G_n = \{(x,y) : d((x,y), C) = 0\}$, which equals $C$.


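4.5.6 Given $\epsilon > 0$, it is enough to find an open superset $G$ of $A$ satisfying
$\operatorname{area}(G) \le \operatorname{area}(A) + 2\epsilon$. If $\operatorname{area}(A) = \infty$, $G = \mathbf{R}^2$ will do. If $\operatorname{area}(A) < \infty$,
choose a paving $(Q_n)$, such that $\sum_{n=1}^\infty \operatorname{area}(Q_n) \le \operatorname{area}(A) + \epsilon$. For each
$n \ge 1$, let $Q_n'$ be an open rectangle containing $Q_n$ and satisfying
$\operatorname{area}(Q_n') \le \operatorname{area}(Q_n) + \epsilon 2^{-n}$. (For each $n \ge 1$, such a rectangle $Q_n'$ can be obtained by
dilating $Q_n^\circ$ slightly.) Then $G = \bigcup_{n=1}^\infty Q_n'$ is open, $G$ contains $A$, and

\[ \operatorname{area}(G) \le \sum_{n=1}^\infty \operatorname{area}(Q_n') \le \sum_{n=1}^\infty \left( \operatorname{area}(Q_n) + \epsilon 2^{-n} \right) \le \operatorname{area}(A) + 2\epsilon. \]

For the second part, let $\alpha = \inf\{\operatorname{area}(G) : A \subset G,\ G \text{ open}\}$. Choosing $G$ as
above, $\alpha \le \operatorname{area}(G) \le \operatorname{area}(A) + 2\epsilon$ for all $\epsilon$. Hence $\alpha \le \operatorname{area}(A)$. Conversely,
monotonicity implies that $\operatorname{area}(A) \le \operatorname{area}(G)$ for any superset $G$. Hence
$\operatorname{area}(A) \le \alpha$.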


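4.5.7 For each $\epsilon > 0$, by Exercise 4.5.6, choose $G_\epsilon$ open such that $A \subset G_\epsilon$
and $\operatorname{area}(G_\epsilon) \le \operatorname{area}(A) + \epsilon$. Let $I = \bigcap_{n=1}^\infty G_{1/n}$. Then $I$ is interopen,
and $\operatorname{area}(I) \le \inf_{n \ge 1} \operatorname{area}(G_{1/n}) \le \inf_{n \ge 1} (\operatorname{area}(A) + 1/n) = \operatorname{area}(A)$. But
$I \supset A$. So $\operatorname{area}(I) \ge \operatorname{area}(A)$.


4.5.8 If $M$ is measurable, select an interopen superset $I \supset M$ satisfying
$\operatorname{area}(I) = \operatorname{area}(M)$. With $A = I$ in the definition of measurability, we obtain
$\operatorname{area}(I \setminus M) = 0$. Conversely, suppose such $I$ exists and let $A$ be arbitrary.
Then

\[ A \cap M^c \subset (A \cap I^c) \cup (I \cap M^c); \]

hence

\[ \operatorname{area}(A \cap M^c) \le \operatorname{area}(A \cap I^c) + \operatorname{area}(I \cap M^c) = \operatorname{area}(A \cap I^c). \]

Since $I$ is measurable,

\[
\begin{aligned}
\operatorname{area}(A) &\ge \operatorname{area}(A \cap I) + \operatorname{area}(A \cap I^c) \\
&\ge \operatorname{area}(A \cap M) + \operatorname{area}(A \cap M^c).
\end{aligned}
\]

Hence $M$ is measurable.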


4.5.9 We already know that the intersection of a sequence of measurable
sets is measurable. By De Morgan's law (§1.1), $M_n$ measurable implies the
complement $M_n^c$ is measurable. So

\[ \left( \bigcup_{n=1}^\infty M_n \right)^c = \bigcap_{n=1}^\infty M_n^c \]

is measurable. So the complement $\bigcup_{n=1}^\infty M_n$ is measurable.


4.5.10 Let $P_k$, $k = 0, \dots, n$, denote the vertices of $D_n'$. It is enough to
show that the closest approach to $O$ of the line joining $P_k$ and $P_{k+1}$ is at
the midpoint $M = (P_k + P_{k+1})/2$, where the distance to $O$ equals 1. Let
$\theta_k = k\pi/n$. Then the distance squared from the midpoint to $O$ is given by

\[
\begin{aligned}
&\frac{[\cos(2\theta_k) + \cos(2\theta_{k+1})]^2 + [\sin(2\theta_k) + \sin(2\theta_{k+1})]^2}{4\cos^2(\theta_1)} \\
&\qquad = \frac{2 + 2[\cos(2\theta_k)\cos(2\theta_{k+1}) + \sin(2\theta_k)\sin(2\theta_{k+1})]}{4\cos^2(\theta_1)} \\
&\qquad = \frac{2 + 2\cos(2\theta_1)}{4\cos^2(\theta_1)} = 1.
\end{aligned}
\]

Thus the distance to the midpoint is 1. To show that this is the minimum,
check that the line segments $OM$ and $P_kP_{k+1}$ are perpendicular.


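4.5.11 Here $a_n = n\sin(\pi/n)\cos(\pi/n)$, and $a_n' = n\tan(\pi/n)$. So
$a_n a_n' = n^2\sin^2(\pi/n)$. But $a_{2n} = 2n\sin(\pi/2n)\cos(\pi/2n) = n\sin(\pi/n)$. So
$a_{2n} = \sqrt{a_n a_n'}$. Also

\[
\begin{aligned}
\frac{1}{a_{2n}} + \frac{1}{a_n'} &= \frac{1}{n\sin(\pi/n)} + \frac{1}{n\tan(\pi/n)} \\
&= \frac{\cos(\pi/n) + 1}{n\sin(\pi/n)} = \frac{2\cos^2(\pi/2n)}{2n\sin(\pi/2n)\cos(\pi/2n)} \\
&= \frac{2}{2n\tan(\pi/2n)} = \frac{2}{a_{2n}'}.
\end{aligned}
\]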
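The two relations in 4.5.11, $a_{2n} = \sqrt{a_n a_n'}$ and $1/a_{2n} + 1/a_n' = 2/a_{2n}'$, give a doubling scheme for the inscribed and circumscribed polygon areas. The Python sketch below iterates them starting from the square ($n = 4$, where $a_4 = 2$ and $a_4' = 4$); the function name and the starting polygon are choices made for this illustration.

```python
import math


def archimedes(doublings: int):
    """Iterate a_{2n} = sqrt(a_n * a'_n) and a'_{2n} = 2 / (1/a_{2n} + 1/a'_n) from n = 4."""
    n = 4
    a_in = 2.0    # n sin(pi/n) cos(pi/n) at n = 4
    a_out = 4.0   # n tan(pi/n) at n = 4
    for _ in range(doublings):
        a_in_next = math.sqrt(a_in * a_out)
        a_out_next = 2.0 / (1.0 / a_in_next + 1.0 / a_out)
        n, a_in, a_out = 2 * n, a_in_next, a_out_next
    return n, a_in, a_out


for d in (0, 2, 5, 10):
    print(archimedes(d))
```

Each step only needs a square root and a harmonic mean, and the two areas squeeze $\pi$ from below and above.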


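4.5.12 In the definition of measurable, replace $M$ and $A$ by $A$ and $A \cup B$,
respectively. Then $A \cap M$ is replaced by $A$, and $A \cap M^c$ is replaced by $B$.


4.5.13 From the previous exercise and induction,

\[ \operatorname{area}\left( \bigcup_{n=1}^\infty A_n \right) \ge \operatorname{area}\left( \bigcup_{n=1}^N A_n \right) = \sum_{n=1}^N \operatorname{area}(A_n). \]

Let $N \nearrow \infty$ to get

\[ \operatorname{area}\left( \bigcup_{n=1}^\infty A_n \right) \ge \sum_{n=1}^\infty \operatorname{area}(A_n). \]

Since the reverse inequality follows from subadditivity, we are done.


4.5.14 Note that $A$ and $B \setminus A$ are disjoint and their union is $A \cup B$. But
$B \setminus A$ and $A \cap B$ are disjoint and their union is $B$. So

\[
\begin{aligned}
\operatorname{area}(A \cup B) &= \operatorname{area}(A) + \operatorname{area}(B \setminus A) \\
&= \operatorname{area}(A) + \operatorname{area}(B) - \operatorname{area}(A \cap B).
\end{aligned}
\]

The general formula, the inclusion-exclusion principle, is that the area of a
union equals the sum of the areas of the sets minus the sum of the areas of
their double intersections plus the sum of the areas of their triple intersections
minus the sum of the areas of their quadruple intersections, etc.


4.5.15 By Exercise 4.5.6, given $\epsilon > 0$, there is an open superset $G$ of $M$
satisfying $\operatorname{area}(G) \le \operatorname{area}(M) + \epsilon$. If $M$ is measurable and $\operatorname{area}(M) < \infty$,
replace $A$ in (4.5.4) by $G$ to get $\operatorname{area}(G) = \operatorname{area}(M) + \operatorname{area}(G \setminus M)$. Hence
$\operatorname{area}(G \setminus M) \le \epsilon$. If $\operatorname{area}(M) = \infty$, write $M = \bigcup_{n=1}^\infty M_n$ with $\operatorname{area}(M_n) < \infty$
for all $n \ge 1$. For each $n \ge 1$, choose an open superset $G_n$ of $M_n$ satisfying
$\operatorname{area}(G_n \setminus M_n) \le \epsilon 2^{-n}$. Then $G = \bigcup_{n=1}^\infty G_n$ is an open superset of $M$, and
$\operatorname{area}(G \setminus M) \le \sum_{n=1}^\infty \operatorname{area}(G_n \setminus M_n) \le \epsilon$. This completes the first part.
Conversely, suppose, for all $\epsilon > 0$, there is an open superset $G$ of $M$ satisfying
$\operatorname{area}(G \setminus M) \le \epsilon$, and let $A$ be arbitrary. Since $G$ is measurable,

\[
\begin{aligned}
\operatorname{area}(A \cap M) + \operatorname{area}(A \cap M^c) &\le \operatorname{area}(A \cap G) + \operatorname{area}(A \cap G^c) + \operatorname{area}(A \cap (G \setminus M)) \\
&\le \operatorname{area}(A) + \epsilon.
\end{aligned}
\]

Thus $\operatorname{area}(A \cap M) + \operatorname{area}(A \cap M^c) \le \operatorname{area}(A)$. Since the reverse inequality
follows by subadditivity, $M$ is measurable.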


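4.5.16 When $A$ is a rectangle, the result is obvious (draw a picture). In fact,
for a rectangle $Q$ and $Q' = Q + (a,b)$, $\operatorname{area}(Q \cap Q') \ge (1 - \epsilon)^2 \operatorname{area}(Q)$. To deal
with general $A$, let $Q$ be as in Exercise 4.2.15 with $\alpha$ to be determined below,
and let $A' = A + (a,b)$. Then by subadditivity and translation invariance,

\[
\begin{aligned}
\operatorname{area}(Q \cap Q') &\le \operatorname{area}((Q \cap A) \cap (Q' \cap A')) + \operatorname{area}(Q \setminus (Q \cap A)) + \operatorname{area}(Q' \setminus (Q' \cap A')) \\
&= \operatorname{area}((Q \cap A) \cap (Q' \cap A')) + 2 \cdot \operatorname{area}(Q \setminus (Q \cap A)) \\
&\le \operatorname{area}(A \cap A') + 2 \cdot \operatorname{area}(Q \setminus (Q \cap A)).
\end{aligned}
\]

But, from Exercise 4.2.15 and the measurability of $A$,
$\operatorname{area}[Q \setminus (Q \cap A)] < (1 - \alpha)\operatorname{area}(Q)$. Hence

\[
\begin{aligned}
\operatorname{area}(A \cap A') &\ge \operatorname{area}(Q \cap Q') - 2(1 - \alpha)\operatorname{area}(Q) \\
&\ge (1 - \epsilon)^2 \operatorname{area}(Q) - 2(1 - \alpha)\operatorname{area}(Q).
\end{aligned}
\]

Thus the result follows as soon as one chooses $2(1 - \alpha) < (1 - \epsilon)^2$.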


4.5.17 Since $\operatorname{area}(A \cap N) = 0$,

\[ \operatorname{area}(A) \ge \operatorname{area}(A \cap N^c) = \operatorname{area}(A \cap N) + \operatorname{area}(A \cap N^c); \]

hence $N$ is measurable.


4.5.18 If $\operatorname{area}[A \cap (A + (a,b))] > 0$, then $A \cap (A + (a,b))$ is nonempty. If
$(x,y) \in A \cap (A + (a,b))$, then $(x,y) = (x',y') + (a,b)$ with $(x',y') \in A$. Hence
$(a,b) \in A - A$. Since Exercise 4.5.16 says that $\operatorname{area}[A \cap (A + (a,b))] > 0$ for
all $(a,b) \in Q_\epsilon$, the result follows.


A.5 Solutions to Chapter 5


Solutions to Exercises 5.1 


5.1.1 Let $f_n(x) = 1/n$ for all $x \in \mathbf{R}$. Then $\int_{-\infty}^\infty f_n(x)\,dx = \infty$ for all $n \ge 1$,
and $f_n(x) \searrow f(x) = 0$.


5.1.2 Let $I_n = \int_a^b f_n(x)\,dx$ and $I = \int_a^b f(x)\,dx$. We have to show that $I_* \ge I$,
where $I_*$ denotes the lower limit of $(I_n)$. The lower sequence is

\[ g_n(x) = \inf\{f_k(x) : k \ge n\}, \qquad n \ge 1. \]

Then $(g_n(x))$ is nonnegative and increasing to $f(x)$, $a < x < b$. So the
monotone convergence theorem applies. So $J_n = \int_a^b g_n(x)\,dx \to \int_a^b f(x)\,dx = I$.
Since $f_n(x) \ge g_n(x)$, $a < x < b$, $I_n \ge J_n$. Hence $I_* \ge J_* = I$.


5.1.3 Given $x$ fixed, $|x - n| > 1$ for $n$ large enough. Hence $f_0(x - n) = 0$.
Hence $f_n(x) = 0$ for $n$ large enough. Thus $f(x) = \lim_{n \nearrow \infty} f_n(x) = 0$. But,
by translation invariance,

\[ \int_{-\infty}^\infty f_n(x)\,dx = \int_{-\infty}^\infty f_0(x)\,dx = \int_{-1}^1 [1 - x^2]\,dx = \frac{4}{3} > 0. \]

Since $\int_{-\infty}^\infty f(x)\,dx = 0$, here, the inequality in Fatou's lemma is strict.


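5.1.4 By Exercise 3.2.4, $(1 - t/n)^n \nearrow e^{-t}$ as $n \nearrow \infty$. To take care of the
upper limit of integration that changes with $n$, let

\[ f_n(t) = \begin{cases} (1 - t/n)^n\, t^{x-1}, & 0 < t < n, \\ 0, & t \ge n. \end{cases} \]

Then by the monotone convergence theorem,

\[
\begin{aligned}
\Gamma(x) &= \int_0^\infty e^{-t} t^{x-1}\,dt \\
&= \int_0^\infty \lim_{n \nearrow \infty} f_n(t)\,dt \\
&= \lim_{n \nearrow \infty} \int_0^\infty f_n(t)\,dt \\
&= \lim_{n \nearrow \infty} \int_0^n \left( 1 - \frac{t}{n} \right)^n t^{x-1}\,dt.
\end{aligned}
\]


5.1.5 By Exercise 4.4.11,

\[ \int_0^n \left( 1 - \frac{t}{n} \right)^n t^{x-1}\,dt = n^x \int_0^1 (1 - s)^n s^{x-1}\,ds = \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)}. \]

For the second limit, replace $x$ by $x + 1$ and note $n/(x + 1 + n) \to 1$ as $n \nearrow \infty$.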
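As a numerical illustration of 5.1.4 and 5.1.5 together, the product $n^x\, n!/(x(x+1)\cdots(x+n))$ can be evaluated for growing $n$ and compared with the gamma function. The Python sketch below works with logarithms to avoid overflow; the function name is an assumption, and `math.gamma` and `math.lgamma` from the standard library serve as the reference.

```python
import math


def gauss_product(x: float, n: int) -> float:
    """Evaluate n^x * n! / (x (x+1) ... (x+n)) using logarithms."""
    log_value = (x * math.log(n)
                 + math.lgamma(n + 1)                       # log(n!)
                 - sum(math.log(x + k) for k in range(n + 1)))
    return math.exp(log_value)


for n in (10, 1_000, 100_000):
    print(n, gauss_product(0.5, n), math.gamma(0.5))
```

The values increase with $n$, as the monotone convergence argument suggests, and approach $\Gamma(1/2) = \sqrt{\pi}$.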


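5.1.6 Convexity of $f_n$ means that $f_n((1-t)x + ty) \le (1-t)f_n(x) + tf_n(y)$ for
all $a < x < y < b$ and $0 \le t \le 1$. Letting $n \nearrow \infty$, we obtain
$f((1-t)x + ty) \le (1-t)f(x) + tf(y)$ for all $a < x < y < b$ and $0 \le t \le 1$, which says that $f$ is
convex. Now, let

\[ f_n(x) = \log\left( \frac{n^x\, n!}{x \cdot (x+1) \cdots (x+n)} \right). \]

Then $\log\Gamma(x) = \lim_{n \nearrow \infty} f_n(x)$ by (5.1.3) and

\[ \frac{d^2}{dx^2} f_n(x) = \frac{d^2}{dx^2} \left( x\log n + \log(n!) - \sum_{k=0}^n \log(x + k) \right) = \sum_{k=0}^n \frac{1}{(x + k)^2}, \]

which is positive. Thus $f_n$ is convex. So $\log\Gamma$ is convex.


5.1.7 Since $\log\Gamma(x)$ is convex,

\[ \log\Gamma((1-t)x + ty) \le (1-t)\log\Gamma(x) + t\log\Gamma(y) \]

for $0 < x < y < \infty$, $0 \le t \le 1$. Since $e^x$ is convex and increasing,

\[
\begin{aligned}
\Gamma((1-t)x + ty) &= \exp(\log\Gamma((1-t)x + ty)) \\
&\le \exp((1-t)\log\Gamma(x) + t\log\Gamma(y)) \\
&\le (1-t)\exp(\log\Gamma(x)) + t\exp(\log\Gamma(y)) \\
&= (1-t)\Gamma(x) + t\Gamma(y)
\end{aligned}
\]

for $0 < x < y < \infty$ and $0 \le t \le 1$.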


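5.1.8 Use summation under the integral sign, with $f_n(t) = t^{x-1}e^{-nt}$, $n \ge 1$. Then substituting $s = nt$, $ds = n\,dt$,

$$\begin{aligned}
\int_0^\infty \frac{t^{x-1}}{e^t - 1}\,dt &= \int_0^\infty \sum_{n=1}^\infty t^{x-1}e^{-nt}\,dt \\
&= \sum_{n=1}^\infty \int_0^\infty t^{x-1}e^{-nt}\,dt \\
&= \sum_{n=1}^\infty n^{-x} \int_0^\infty s^{x-1}e^{-s}\,ds \\
&= \zeta(x)\Gamma(x).
\end{aligned}$$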
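5.1.9 Use summation under the integral sign, with $f_n(t) = t^{x/2-1}e^{-n^2\pi t}$, $n \ge 1$. Then substituting $s = n^2\pi t$, $ds = n^2\pi\,dt$,

$$\begin{aligned}
\int_0^\infty \psi(t)\, t^{x/2-1}\,dt &= \int_0^\infty \sum_{n=1}^\infty e^{-n^2\pi t}\, t^{x/2-1}\,dt \\
&= \sum_{n=1}^\infty \int_0^\infty e^{-n^2\pi t}\, t^{x/2-1}\,dt \\
&= \pi^{-x/2} \sum_{n=1}^\infty n^{-x} \int_0^\infty e^{-s} s^{x/2-1}\,ds \\
&= \pi^{-x/2}\zeta(x)\Gamma(x/2).
\end{aligned}$$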


5.1.10 By Exercise 4.4.7,

$$\int_0^1 t^{x-1}(-\log t)^{n-1}\,dt = \int_0^\infty e^{-xs} s^{n-1}\,ds = \frac{\Gamma(n)}{x^n}.$$

Here we used the substitutions $t = e^{-s}$, then $xs = r$.


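5.1.11 Recall that $\log t < 0$ on $(0,1)$ and $0 < \log t < t$ on $(1,\infty)$. From the previous exercise, with $f(t) = e^{-t}t^{x-1}|\log t|^{n-1}$,

$$\begin{aligned}
\int_0^\infty e^{-t}t^{x-1}|\log t|^{n-1}\,dt &= \int_0^1 f(t)\,dt + \int_1^\infty f(t)\,dt \\
&\le \int_0^1 t^{x-1}|\log t|^{n-1}\,dt + \int_1^\infty e^{-t}t^{x-1}t^{n-1}\,dt \\
&\le \frac{\Gamma(n)}{x^n} + \Gamma(x + n - 1).
\end{aligned}$$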


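5.1.12 If $x_n \searrow 1$, then $k^{-x_n} \nearrow k^{-1}$ for $k \ge 1$. So by the monotone convergence theorem for series, $\zeta(x_n) = \sum_{k=1}^\infty k^{-x_n} \nearrow \sum_{k=1}^\infty k^{-1} = \zeta(1) = \infty$. If $x_n \to 1+$, then $x_n^* \searrow 1$ (§1.5) and $\zeta(1) \ge \zeta(x_n) \ge \zeta(x_n^*)$. So $\zeta(x_n) \to \zeta(1) = \infty$. Thus $\zeta(1+) = \infty$. Similarly, $\psi(0+) = \infty$.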


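5.1.13 Since $\tau(t) = \sum_{n=0}^\infty t e^{-nt}$, use summation under the integral sign:

$$\int_0^\infty e^{-xt}\tau(t)\,dt = \sum_{n=0}^\infty \int_0^\infty e^{-xt}\, t e^{-nt}\,dt = \sum_{n=0}^\infty \frac{1}{(x+n)^2}.$$

Here we used the substitution $s = (x + n)t$, and $\Gamma(2) = 1$.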
5.1.14 The problem, here, is that the limits of integration depend on $n$. So the monotone convergence theorem is not directly applicable. To remedy this, let $f_n(x) = f(x)$ if $a_n < x < b_n$, and let $f_n(x) = 0$ if $a < x \le a_n$ or $b_n \le x < b$. Then $f_n(x) \nearrow f(x)$ (draw a picture). Hence by the monotone convergence theorem,

$$\int_{a_n}^{b_n} f(x)\,dx = \int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx.$$


5.1.15 Differentiate the log of both sides using the fundamental theorem. 


Solutions to Exercises 5.2 


5.2.1 Dividing yields

$$\frac{t^4(1-t)^4}{1+t^2} = t^6 - 4t^5 + 5t^4 - 4t^2 + 4 - \frac{4}{1+t^2}.$$

Integrate over $(0,1)$.
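The division and the resulting integral can be verified with exact rational arithmetic. This is an illustrative sketch only; the helper names `poly_mul` and `integral_0_1` are not from the text, and polynomials are stored as coefficient lists from the constant term upward.

```python
from fractions import Fraction
from math import pi

QUOTIENT = [4, 0, -4, 0, 5, -4, 1]          # t^6 - 4t^5 + 5t^4 - 4t^2 + 4
DIVISOR = [1, 0, 1]                         # 1 + t^2
NUMERATOR = [0, 0, 0, 0, 1, -4, 6, -4, 1]   # t^4 (1 - t)^4


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integral_0_1(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))


def check_division():
    """Check that QUOTIENT * (1 + t^2) - 4 equals t^4 (1 - t)^4."""
    product = poly_mul(QUOTIENT, DIVISOR)
    product[0] -= 4
    return product == NUMERATOR


if __name__ == "__main__":
    print(check_division())
    # Integral of the polynomial part; the remaining term integrates to -pi.
    print(integral_0_1(QUOTIENT), float(integral_0_1(QUOTIENT)) - pi)
```

Since the integrand on the left is positive on $(0,1)$, the printed difference is the positive gap between the polynomial integral and $\pi$.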
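5.2.2 First, for $x > 0$,

$$e^{-sx}\left(1 + \frac{x^2}{3!} + \frac{x^4}{5!} + \cdots\right) \le e^{-sx}\sum_{n=0}^\infty \frac{x^n}{n!} = e^{-sx}e^{x}.$$

So with $g(x) = e^{-(s-1)x}$, we may use summation under the integral sign to get

$$\begin{aligned}
\int_0^\infty e^{-sx}\frac{\sin x}{x}\,dx &= \int_0^\infty \sum_{n=0}^\infty (-1)^n \frac{x^{2n}}{(2n+1)!}\, e^{-sx}\,dx \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!}\int_0^\infty e^{-sx}x^{2n}\,dx \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!}\cdot\frac{\Gamma(2n+1)}{s^{2n+1}} \\
&= \sum_{n=0}^\infty \frac{(-1)^n (1/s)^{2n+1}}{2n+1} \\
&= \arctan\left(\frac{1}{s}\right).
\end{aligned}$$

Here we used (3.6.4).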
5.2.3 Since $|f_n(x)| \le g(x)$ for all $n \ge 1$, taking the limit yields $|f(x)| \le g(x)$. Since $g$ is integrable, so is $f$.

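5.2.4 $J_0(x)$ is a power series; hence it may be differentiated term by term. The calculation is made simpler by noting that $x[xJ_0'(x)]' = x^2J_0''(x) + xJ_0'(x)$. Then

$$xJ_0'(x) = \sum_{n=1}^\infty (-1)^n \frac{2n\,x^{2n}}{4^n(n!)^2}$$

and

$$\begin{aligned}
x^2J_0''(x) + xJ_0'(x) = x(xJ_0'(x))' &= \sum_{n=1}^\infty (-1)^n \frac{4n^2x^{2n}}{4^n(n!)^2} \\
&= \sum_{n=0}^\infty (-1)^{n+1}\frac{4(n+1)^2x^{2n+2}}{4^{n+1}((n+1)!)^2} = -x^2J_0(x).
\end{aligned}$$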


5.2.5 With $u = \sin^{n-1}x$ and $dv = \sin x\,dx$,

$$I_n = \int \sin^n x\,dx = -\cos x\,\sin^{n-1}x + (n-1)\int \sin^{n-2}x\,\cos^2 x\,dx.$$

Inserting $\cos^2 x = 1 - \sin^2 x$,

$$I_n = -\cos x\,\sin^{n-1}x + (n-1)(I_{n-2} - I_n).$$

Solving for $I_n$,

$$I_n = -\frac{1}{n}\cos x\,\sin^{n-1}x + \frac{n-1}{n}I_{n-2}.$$


5.2.6 Using the double-angle formula,

$$\begin{aligned}
\sin x &= 2\cos(x/2)\sin(x/2) \\
&= 4\cos(x/2)\cos(x/4)\sin(x/4) \\
&= \cdots = 2^n\cos(x/2)\cos(x/4)\cdots\cos(x/2^n)\sin(x/2^n).
\end{aligned}$$

Now, let $n \nearrow \infty$, and use $2^n\sin(x/2^n) \to x$.
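Both the finite identity and its limit are easy to test numerically. The sketch below is illustrative only; `cosine_product` is a hypothetical helper name, and the sample point $x = 1.3$ is arbitrary.

```python
from math import cos, sin


def cosine_product(x, n):
    """Return cos(x/2) cos(x/4) ... cos(x/2^n)."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= cos(x / 2**k)
    return prod


if __name__ == "__main__":
    x = 1.3
    for n in (1, 5, 10, 40):
        # Finite identity: sin x = 2^n * product * sin(x / 2^n).
        finite = 2**n * cosine_product(x, n) * sin(x / 2**n)
        print(n, finite, sin(x), cosine_product(x, n), sin(x) / x)
```

The third and fourth printed columns agree for every $n$, while the last two columns agree only in the limit.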


5.2.7 The integral on the left equals $n!$. So the left side is $\sum_{n=0}^\infty (-1)^n$, which has no sum. The series on the right equals $e^{-x}$. So the right side equals $\int_0^\infty e^{-2x}\,dx = 1/2$.


5.2.8 Now, for $x > 0$,

$$\frac{\sin(sx)}{e^x - 1} = \sum_{n=1}^\infty e^{-nx}\sin(sx)$$

and

$$\sum_{n=1}^\infty e^{-nx}|\sin(sx)| \le \sum_{n=1}^\infty e^{-nx}|s|x = |s|x/(e^x - 1) = g(x),$$

which is integrable ($\int_0^\infty g(x)\,dx = |s|\Gamma(2)\zeta(2)$ by Exercise 5.1.8). Hence we may use summation under the integral sign to obtain

$$\int_0^\infty \frac{\sin(sx)}{e^x - 1}\,dx = \sum_{n=1}^\infty \int_0^\infty e^{-nx}\sin(sx)\,dx = \sum_{n=1}^\infty \frac{s}{n^2 + s^2}.$$

Here we used Exercise 4.4.9.


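5.2.9 Writing $\sinh(sx) = (e^{sx} - e^{-sx})/2$ and breaking the integral into two pieces leads to infinities. So we proceed, as in the previous exercise. For $x > 0$, use the mean value theorem to check

$$\frac{\sinh x}{x} \le \cosh x \le e^x.$$

So

$$\frac{\sinh(sx)}{e^x - 1} = \sum_{n=1}^\infty e^{-nx}\sinh(sx)$$

and

$$\sum_{n=1}^\infty e^{-nx}|\sinh(sx)| \le \sum_{n=1}^\infty |s|x\,e^{-nx}e^{|s|x} = g(x),$$

which is integrable when $|s| < 1$ ($\int_0^\infty g(x)\,dx = |s|\sum_{n=1}^\infty \Gamma(2)/(n - |s|)^2$). Using summation under the integral sign, we obtain

$$\begin{aligned}
\int_0^\infty \frac{\sinh(sx)}{e^x - 1}\,dx &= \sum_{n=1}^\infty \int_0^\infty e^{-nx}\sinh(sx)\,dx \\
&= \frac{1}{2}\sum_{n=1}^\infty \int_0^\infty \left(e^{-(n-s)x} - e^{-(n+s)x}\right)dx \\
&= \frac{1}{2}\sum_{n=1}^\infty \left(\frac{1}{n-s} - \frac{1}{n+s}\right) \\
&= \sum_{n=1}^\infty \frac{s}{n^2 - s^2}.
\end{aligned}$$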
5.2.10 Clearly $s_n \le \pi$, so it is enough to show that the $n$th tail is $\le 1/(4(n+1)^2 16^{n+1})$. Now

$$\frac{1}{8k+1} - \frac{1}{8k+p} \le \frac{p-1}{64k^2}.$$

Applying this with $p = 4, 5, 6$, the $n$th tail is less than

$$\frac{2(4-1) + (5-1) + (6-1)}{64(n+1)^2 16^{n+1}}\left(1 + \frac{1}{16} + \frac{1}{16^2} + \cdots\right) = \frac{1}{4(n+1)^2 16^{n+1}}.$$


5.2.11 This follows immediately from Exercise 1.3.23. Here is Python code:

from fractions import Fraction
f = Fraction


def partialsum(n):
    # n-th partial sum of the series, in exact rational arithmetic
    p = 0
    for k in range(0, n + 1):
        s = f(4, 8*k + 1) - f(2, 8*k + 4)
        s -= f(1, 8*k + 5) + f(1, 8*k + 6)
        s *= f(1, 16**k)
        p += s
    return p


def tail(n):
    # bound on the n-th tail from Exercise 5.2.10
    return f(1, 4 * 16**(n + 1) * (n + 1)**2)


def cf(r):
    # list of continued fraction digits of the rational r
    n = r.numerator
    d = r.denominator
    if n % d == 0:
        return [n // d]
    else:
        return [n // d] + cf(f(d, n % d))


def CF(r):
    # continued fraction of r, with the integer part split off
    n = r.numerator
    d = r.denominator
    if n % d == 0:
        return n // d
    elif n < d:
        return cf(f(d, n % d))
    else:
        return str(n // d) + ' + ' + str(cf(f(d, n % d)))


for n in range(0, 6):
    e = tail(n)
    s = partialsum(n)
    S = s + e
    print(n, "\n", s, "\n", S, "\n", CF(s), "\n", CF(S), "\n\n")


5.2.12 We have to show that $x_n \to x$ implies $J_\nu(x_n) \to J_\nu(x)$. But $g(t) = 1$ is integrable over $(0,\pi)$ and dominates the integrands below. So we can apply the dominated convergence theorem,

$$J_\nu(x_n) = \frac{1}{\pi}\int_0^\pi \cos(\nu t - x_n\sin t)\,dt \to \frac{1}{\pi}\int_0^\pi \cos(\nu t - x\sin t)\,dt = J_\nu(x).$$


5.2.13 It is enough to show that $\psi$ is continuous on $(a,\infty)$ for all $a > 0$, for then $\psi$ is continuous on $(0,\infty)$. We have to show that $x_n \to x > a$ implies $\psi(x_n) \to \psi(x)$. But $x_n > a$, $n \ge 1$, implies $e^{-k^2\pi x_n} \le e^{-k^2\pi a}$, $k \ge 1$, $n \ge 1$, and $\sum g_k = \sum e^{-k^2\pi a} < \infty$. So the dominated convergence theorem for series applies, and

$$\psi(x_n) = \sum_{k=1}^\infty e^{-k^2\pi x_n} \to \sum_{k=1}^\infty e^{-k^2\pi x} = \psi(x).$$


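5.2.14 Set $\tilde f_n(x) = f_n(x)$ if $a_n < x < b_n$, and $\tilde f_n(x) = 0$ if $a < x \le a_n$ or $b_n \le x < b$. Then $|\tilde f_n(x)| \le g(x)$ on $(a,b)$, and $\tilde f_n(x) \to f(x)$ for any $x$ in $(a,b)$. Hence by the dominated convergence theorem,

$$\int_{a_n}^{b_n} f_n(x)\,dx = \int_a^b \tilde f_n(x)\,dx \to \int_a^b f(x)\,dx.$$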


5.2.15 Use the Taylor series for cos:

$$J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin t)\,dt = \sum_{n=0}^\infty (-1)^n\frac{x^{2n}}{(2n)!}\cdot\frac{1}{\pi}\int_0^\pi \sin^{2n}t\,dt.$$

But

$$\frac{1}{\pi}\int_0^\pi \sin^{2n}t\,dt = \frac{2}{\pi}\int_0^{\pi/2}\sin^{2n}t\,dt = \frac{2}{\pi}I_{2n} = \frac{(2n-1)\cdot(2n-3)\cdots 1}{2n\cdot(2n-2)\cdots 2} = \frac{(2n)!}{2^{2n}(n!)^2}.$$

Inserting this in the previous expression, one obtains the series for $J_0(x)$. Here for $x$ fixed, we used summation under the integral sign with $g(t) = e^x$. Since $\int_0^\pi g(t)\,dt = e^x\pi$, this applies.


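5.2.16 By Exercise 4.4.22,

$$\lim_{n\nearrow\infty}\left[\int_0^1 \frac{1 - (1 - t/n)^n}{t}\,dt - \int_1^n \frac{(1 - t/n)^n}{t}\,dt\right] = \lim_{n\nearrow\infty}\left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right).$$

For the second part, since $(1 - t/n)^n \to e^{-t}$, we obtain the stated formula by switching the limits and the integrals. To justify the switching, by the mean value theorem with $f(t) = (1 - t/n)^n$,

$$0 \le \frac{f(0) - f(t)}{t} = -f'(c) = \left(1 - \frac{c}{n}\right)^{n-1} \le 1, \qquad 0 < c < t < n.$$

So we may choose $g(t) = 1$ for the first integral. Since $(1 - t/n)^n \le e^{-t}$, we may choose $g(t) = e^{-t}/t$ for the second integral.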


5.2.17 Use Euler's continued fraction formula with $a_0 = x$, $a_1 = -x^2/3$, $a_2 = -x^2/5$, $a_3 = -x^2/7$, ....


Solutions to Exercises 5.3 


5.3.1 By convexity of $e^x$,

$$a^{1-t}b^t = e^{(1-t)\log a + t\log b} \le (1-t)e^{\log a} + te^{\log b} = (1-t)a + tb.$$

Thus $0 < b < b' < a' < a$. The rest follows as in §5.3.


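5.3.2 If $a_n \to a$ and $b_n \to b$ with $a > b > 0$, then there is a $c > 0$ with $a_n > c$ and $b_n > c$ for all $n \ge 1$. Hence

$$\frac{2}{\pi}\cdot\frac{1}{\sqrt{a_n^2\cos^2\theta + b_n^2\sin^2\theta}} \le \frac{2}{\pi}\cdot\frac{1}{\sqrt{c^2\cos^2\theta + c^2\sin^2\theta}} = \frac{2}{c\pi}.$$

Hence we may apply the dominated convergence theorem with $g(\theta) = 2/c\pi$.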


5.3.3 Note the map $\theta \mapsto z'$ is smooth on $(0,\pi/2)$ with range in $(-1,1)$. Since $\arccos : (-1,1) \to (0,\pi)$ is smooth, the map $G(\theta) = \theta' = \arccos(z')$ is well defined and smooth on $(0,\pi/2)$. From the proof, we know $G'(\theta) = 2b/A > 0$, so $G$ is strictly increasing. Since $G(0+) = G(0) = 0$ and $G(\pi/2-) = G(\pi/2) = \pi$, the result follows.


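5.3.4 $b\,dt = b^2\sec^2\theta\,d\theta = b^2(1 + \tan^2\theta)\,d\theta = (b^2 + t^2)\,d\theta$. Moreover $a^2\cos^2\theta + b^2\sin^2\theta = \cos^2\theta\,(a^2 + t^2)$ and $b^2\sec^2\theta = b^2 + b^2\tan^2\theta = b^2 + t^2$. Thus

$$a^2\cos^2\theta + b^2\sin^2\theta = b^2\cdot\frac{a^2 + t^2}{b^2 + t^2},$$

and the result follows.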


5.3.5 Since the arithmetic and geometric means of $a = 1 + x$ and $b = 1 - x$ are $(1, \sqrt{1 - x^2})$, $M(1 + x, 1 - x) = M(1, \sqrt{1 - x^2})$. So the result follows from

$$\cos^2\theta + (1 - x^2)\sin^2\theta = 1 - x^2\sin^2\theta.$$
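The identity $M(1+x, 1-x) = M(1, \sqrt{1-x^2})$ can be observed numerically by running the arithmetic-geometric mean iteration directly. The following is a minimal sketch; the function name `agm` and the fixed iteration count are choices made for this illustration, not taken from the text.

```python
from math import sqrt


def agm(a, b, iterations=60):
    """Arithmetic-geometric mean M(a, b) of two positive reals.

    The iteration converges quadratically, so 60 steps is far more than
    double precision needs; a fixed count keeps the loop trivially finite.
    """
    for _ in range(iterations):
        a, b = (a + b) / 2, sqrt(a * b)
    return a


if __name__ == "__main__":
    for x in (0.1, 0.6, 0.9):
        print(x, agm(1 + x, 1 - x), agm(1, sqrt(1 - x * x)))
```

The two printed values agree because one step of the iteration maps $(1+x, 1-x)$ to $(1, \sqrt{1-x^2})$.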
5.3.6 By the binomial theorem,

$$\frac{1}{\sqrt{1 - x^2\sin^2\theta}} = \sum_{n=0}^\infty (-1)^n\binom{-1/2}{n}x^{2n}\sin^{2n}\theta.$$

By Exercise 3.5.10, $\binom{-1/2}{n} = (-1)^n 4^{-n}\binom{2n}{n}$. So this series is positive. Hence we may apply summation under the integral sign. From Exercise 5.2.15, $(2/\pi)\int_0^{\pi/2}\sin^{2n}\theta\,d\theta = 4^{-n}\binom{2n}{n}$. Integrating the series term by term, we get the result.

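5.3.7 With $t = x/s$, $dt = -x\,ds/s^2$, and $f(t) = 1/\sqrt{(1 + t^2)(x^2 + t^2)}$, $f(t)\,dt = -f(s)\,ds$. So

$$\begin{aligned}
\frac{1}{M(1,x)} &= \frac{2}{\pi}\int_0^\infty \frac{dt}{\sqrt{(1 + t^2)(x^2 + t^2)}} \\
&= \frac{2}{\pi}\int_0^{\sqrt{x}} f(t)\,dt + \frac{2}{\pi}\int_{\sqrt{x}}^\infty f(t)\,dt \\
&= \frac{4}{\pi}\int_0^{\sqrt{x}} \frac{dt}{\sqrt{(1 + t^2)(x^2 + t^2)}} \\
&= \frac{4}{\pi}\int_0^{1/\sqrt{x}} \frac{dr}{\sqrt{(1 + (xr)^2)(1 + r^2)}}.
\end{aligned}$$

For the last integral, we used $t = xr$, $dt = x\,dr$.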


5.3.8 The AGM iteration yields $(1 + x, 1 - x) \mapsto (1, x') \mapsto ((1 + x')/2, \sqrt{x'})$.
5.3.9 Now, $x < M(1,x) < 1$, and $x' < M(1,x') < 1$. So

$$\left|\frac{1}{M(1,x)} - \frac{1}{Q(x)}\right| = \frac{1 - M(1,x')}{M(1,x)} \le \frac{1 - x'}{x} = \frac{1 - x'^2}{x(1 + x')} = \frac{x}{1 + x'}.$$


5.3.10 We already know that

$$Q\left(\frac{2\sqrt{x}}{1 + x}\right) = 2Q(x).$$

Substitute $x = 2\sqrt{y}/(1 + y)$. Then $x' = (1 - y)/(1 + y)$. So solving for $y$ yields $y = (1 - x')/(1 + x')$.


5.3.11 From the integral formula, $M(1,x)$ is strictly increasing and continuous. So $M(1,x')$ is strictly decreasing and continuous. So $Q(x)$ is strictly increasing and continuous. Moreover, $x \to 0$ implies $x' \to 1$ implies $M(1,x) \to 0$ and $M(1,x') \to 1$, which implies $Q(x) \to 0$. Thus $Q(0+) = 0$. If $x \to 1-$, then $x' \to 0+$. Hence $M(1,x) \to 1$ and $M(1,x') \to 0+$. So $Q(x) \to \infty$. Thus $Q(1-) = \infty$; hence $M(1,\cdot) : (0,1) \to (0,1)$ and $Q : (0,1) \to (0,\infty)$ are strictly increasing bijections.


5.3.12 $M(a,b) = 1$ is equivalent to $M(1,b/a) = 1/a$, which is uniquely solvable for $b/a$, hence for $b$, by the previous exercise.


5.3.13 Let $x = b/a = f(a)/a$. Then the stated asymptotic equality is equivalent to

$$\lim_{a\to\infty}\left[\log(x/4) + \frac{\pi a}{2}\right] = 0.$$

Since $0 < b < 1$, $a \to \infty$ implies $x \to 0$ and $M(1,x) = M(1,b/a) = 1/a$ by homogeneity, this follows from (5.3.7).

5.3.14 By multiplying out the $d$ factors in the product, the only terms with $x^{d-1}$ are $a_jx^{d-1}$, $1 \le j \le d$; hence $dp_1 = a_1 + \cdots + a_d$; hence $p_1$ is the arithmetic mean. If $x = 0$ is inserted, the identity reduces to $a_1a_2\cdots a_d = p_d$. If $a_1 = a_2 = \cdots = a_d = 1$, the identity reduces to the binomial theorem; hence $p_k(1,1,\ldots,1) = 1$, $1 \le k \le d$. The arithmetic and geometric mean inequality is then an immediate consequence of Exercise 3.3.28.


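5.3.15 Since $a_1 \ge a_2 \ge \cdots \ge a_d > 0$, replacing $a_2, \ldots, a_{d-1}$ by $a_1$ in $p_1$ increases $p_1$. Similarly, replacing $a_2, \ldots, a_{d-1}$ by $a_d$ in $p_d$ decreases $p_d$. Thus

$$\frac{a_1'}{a_d'} \le \frac{(d-1)a_1 + a_d}{d\,(a_1a_d\cdots a_da_d)^{1/d}} = f_d\left(\frac{a_1}{a_d}\right),$$

where $f_d$ is as in Exercise 3.2.10. The result follows.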
5.3.16 Note by Exercise 3.3.28, $(a_1', \ldots, a_d') = (p_1, \ldots, p_d^{1/d})$ implies $a_1' \ge a_2' \ge \cdots \ge a_d' > 0$. Now $a_1'$ is the arithmetic mean and $a_1$ is the largest; hence $a_1 \ge a_1'$. Also $a_d'$ is the geometric mean and $a_d$ is the smallest; hence $a_d \le a_d'$. Hence $(a_1^{(n)})$ is decreasing and $(a_d^{(n)})$ is increasing and thus both sequences converge to limits $a_{1*} \ge a_{d*}$. If we set $I_n = [a_d^{(n)}, a_1^{(n)}]$, we conclude the intervals $I_n$ are nested, $I_1 \supset I_2 \supset \cdots \supset [a_{d*}, a_{1*}]$, and, for all $n \ge 0$, the reals $a_1^{(n)}, \ldots, a_d^{(n)}$ all lie in $I_n$. By applying the inequality in Exercise 5.3.15 repeatedly, we conclude

Letting $n \to \infty$, we conclude $a_{d*} = a_{1*}$. Denoting this common value by $m$, we conclude $a_j^{(n)} \to m$ as $n \to \infty$, for all $1 \le j \le d$. The last identity follows from the fact that the limit of a sequence is unchanged if the first term of the sequence is discarded.


Solutions to Exercises 5.4 
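5.4.1 If $x = \sqrt{2t}$, $dx = dt/\sqrt{2t}$, and $t = x^2/2$. So

$$\sqrt{\frac{\pi}{2}} = \int_0^\infty e^{-x^2/2}\,dx = \int_0^\infty e^{-t}\frac{dt}{\sqrt{2t}} = \frac{1}{\sqrt{2}}\Gamma(1/2).$$

Hence $(1/2)! = \Gamma(3/2) = (1/2)\Gamma(1/2) = \sqrt{\pi}/2$.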


5.4.2 Since $(x - s)^2 = x^2 - 2xs + s^2$,

$$e^{-s^2/2}L(s) = \int_{-\infty}^\infty e^{-(x-s)^2/2}\,dx = \int_{-\infty}^\infty e^{-x^2/2}\,dx = \sqrt{2\pi}$$

by translation invariance.


5.4.3 By differentiation under the integral sign,

$$L^{(n)}(s) = \int_{-\infty}^\infty e^{sx}x^ne^{-x^2/2}\,dx, \qquad L^{(n)}(0) = \int_{-\infty}^\infty x^ne^{-x^2/2}\,dx.$$

To justify this, note that for $|s| < b$ and $f(s,x) = e^{sx - x^2/2}$,

$$\sum_{k=0}^n\left|\frac{\partial^kf}{\partial s^k}\right| = e^{sx - x^2/2}\sum_{k=0}^n |x|^k \le n!\,e^{sx - x^2/2}\sum_{k=0}^n \frac{|x|^k}{k!} \le n!\,e^{(b+1)|x| - x^2/2} = g(x)$$

and $g$ is even and integrable ($\int_{-\infty}^\infty g(x)\,dx \le 2n!\,L(b+1)$). Since the integrand is odd for $n$ odd, $L^{(n)}(0) = 0$ for $n$ odd. On the other hand, the exponential series yields

$$L(s) = \sqrt{2\pi}\,e^{s^2/2} = \sqrt{2\pi}\sum_{n=0}^\infty \frac{s^{2n}}{2^nn!} = \sum_{n=0}^\infty L^{(2n)}(0)\frac{s^{2n}}{(2n)!}.$$

Solving for $L^{(2n)}(0)$, we obtain the result.
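5.4.4 With $f(s,x) = e^{-x^2/2}\cos(sx)$,

$$|f(s,x)| + \left|\frac{\partial f}{\partial s}(s,x)\right| = e^{-x^2/2}(|\cos(sx)| + |x|\,|\sin(sx)|) \le e^{-x^2/2}(1 + |x|) = g(x),$$

which is integrable since $\int_{-\infty}^\infty g(x)\,dx = \sqrt{2\pi} + 2$. Thus with $u = \sin(sx)$ and $dv = -xe^{-x^2/2}\,dx$, $v = e^{-x^2/2}$, $du = s\cos(sx)\,dx$. So

$$F'(s) = -\int_{-\infty}^\infty e^{-x^2/2}x\sin(sx)\,dx = uv\Big|_{-\infty}^{\infty} - \int_{-\infty}^\infty v\,du = -s\int_{-\infty}^\infty e^{-x^2/2}\cos(sx)\,dx = -sF(s).$$

Integrating $F'(s)/F(s) = -s$ over $(0,s)$ yields $\log F(s) = -s^2/2 + \log F(0)$ or $F(s) = F(0)e^{-s^2/2}$.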


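5.4.5 With $f(a,x) = e^{-x-a/x}/\sqrt{x}$ and $a \ge \epsilon > 0$,

$$|f(a,x)| + \left|\frac{\partial f}{\partial a}(a,x)\right| \le \begin{cases} e^{-\epsilon/x}\left(\dfrac{1}{\sqrt{x}} + \dfrac{1}{x\sqrt{x}}\right), & 0 < x < 1, \\[2ex] 2e^{-x}, & x \ge 1. \end{cases}$$

But the expression on the right is integrable over $(0,\infty)$. Hence we may differentiate under the integral sign on $(\epsilon,\infty)$, hence on $(0,\infty)$. Thus with $x = a/t$,

$$H'(a) = -\int_0^\infty e^{-x-a/x}\frac{dx}{x\sqrt{x}} = -\frac{1}{\sqrt{a}}\int_0^\infty e^{-a/t-t}\frac{dt}{\sqrt{t}} = -\frac{1}{\sqrt{a}}H(a).$$

Integrating $H'(a)/H(a) = -1/\sqrt{a}$, we get $\log H(a) = -2\sqrt{a} + \log H(0)$ or $H(a) = H(0)e^{-2\sqrt{a}}$.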


5.4.6 Let $x = y\sqrt{q}$. Then $dx = \sqrt{q}\,dy$. Hence

$$\int_{-\infty}^\infty e^{-x^2/2q}\,dx = \sqrt{q}\int_{-\infty}^\infty e^{-y^2/2}\,dy = \sqrt{2\pi q}.$$


5.4.7 Inserting $g(x) = e^{-x^2\pi}$ and $\delta = \sqrt{t}$ in Exercise 4.3.8 and $q = 1/2\pi$ in the previous exercise yields

$$\lim_{t\to0+}\sqrt{t}\,\psi(t) = \int_0^\infty e^{-x^2\pi}\,dx = \frac{1}{2}\sqrt{2\pi q} = \frac{1}{2}.$$


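5.4.8 It is enough to show that $\zeta$ is smooth on $(a,\infty)$ for any $a > 1$. Use differentiation $N$ times under the summation sign to get

$$\zeta^{(k)}(s) = \sum_{n=1}^\infty \frac{(-1)^k\log^kn}{n^s}, \qquad s > a,\ N \ge k \ge 0.$$

To justify this, let $f_n(s) = n^{-s}$, $n \ge 1$, $s > 1$. Since $\log n/n^\epsilon \to 0$ as $n \nearrow \infty$, for any $\epsilon > 0$, the sequence $(\log n/n^\epsilon)$ is bounded, which means that there is a constant $C_\epsilon > 0$, such that $|\log n| \le C_\epsilon n^\epsilon$ for all $n \ge 1$. Hence

$$\sum_{k=0}^N |f_n^{(k)}(s)| \le \sum_{k=0}^N \frac{C_\epsilon^kn^{k\epsilon}}{n^s} \le Cn^{N\epsilon - a} = g_n,$$

where the constant $C$ depends only on $N$ and $\epsilon$. Then if we choose $\epsilon$ small enough, so that $a - N\epsilon > 1$, the dominating series $\sum g_n = C\sum n^{N\epsilon - a}$ converges.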


5.4.9 Again, we show that $\psi$ is smooth on $(a,\infty)$ for all $a > 0$. Use differentiation $N$ times under the summation sign to get

$$\psi^{(k)}(t) = \sum_{n=1}^\infty (-1)^k\pi^kn^{2k}e^{-n^2\pi t}, \qquad t > a,\ N \ge k \ge 0.$$

To justify this, let $f_n(t) = e^{-n^2\pi t}$, $n \ge 1$, $t \ge a$. Since $x^{N+1}e^{-x} \to 0$ as $x \to \infty$, the function $x^{N+1}e^{-x}$ is bounded. Thus there is a constant $C_N > 0$, such that $x^Ne^{-x} \le C_N/x$ for $x > 0$. Inserting $x = n^2\pi t$, $t > a$,

$$\sum_{k=0}^N |f_n^{(k)}(t)| \le \frac{(N+1)C_N}{\pi a^{N+1}n^2} = g_n.$$

Then the dominating series $\sum g_n = \sum (N+1)C_N/\pi a^{N+1}n^2$ converges.


5.4.10 Differentiating under the integral sign leads only to bounded functions of $t$. So $J_\nu$ is smooth. Computing, we get

$$J_\nu'(x) = \frac{1}{\pi}\int_0^\pi \sin t\,\sin(\nu t - x\sin t)\,dt$$

and

$$J_\nu''(x) = -\frac{1}{\pi}\int_0^\pi \sin^2t\,\cos(\nu t - x\sin t)\,dt.$$

Now, integrate by parts with $u = -x\cos t - \nu$, $dv = (\nu - x\cos t)\cos(\nu t - x\sin t)\,dt$, $du = x\sin t\,dt$, and $v = \sin(\nu t - x\sin t)$:

$$\begin{aligned}
x^2J_\nu''(x) + (x^2 - \nu^2)J_\nu(x) &= \frac{1}{\pi}\int_0^\pi (x^2\cos^2t - \nu^2)\cos(\nu t - x\sin t)\,dt \\
&= \frac{1}{\pi}\int_0^\pi u\,dv = \frac{1}{\pi}uv\Big|_0^\pi - \frac{1}{\pi}\int_0^\pi v\,du \\
&= -\frac{1}{\pi}\int_0^\pi x\sin t\,\sin(\nu t - x\sin t)\,dt \\
&= -xJ_\nu'(x).
\end{aligned}$$

Here $\nu$ must be an integer to make the $uv$ term vanish at $\pi$.


5.4.11 Differentiating under the integral sign,

$$F^{(n)}(s) = \int_{-\infty}^\infty x^ne^{sx}e^{-f(x)}\,dx.$$

Since $|x|^n \le n!\,e^{|x|}$, with $h(s,x) = e^{sx}e^{-f(x)}$ and $|s| < b$,

$$\sum_{n=0}^N\left|\frac{\partial^nh}{\partial s^n}\right| \le \sum_{n=0}^N |x|^ne^{|s|\,|x|}e^{-f(x)} \le (N+1)N!\,e^{(b+1)|x| - f(x)} = g(x)$$

and $g$ is integrable by Exercise 4.3.11. This shows that $F$ is smooth. Differentiating twice,

$$[\log F(s)]'' = \frac{F''(s)F(s) - F'(s)^2}{F(s)^2}.$$

Now, use the Cauchy-Schwarz inequality (Exercise 4.4.21) with the functions $e^{(sx - f(x))/2}$ and $xe^{(sx - f(x))/2}$ to get $F''(s)F(s) \ge F'(s)^2$. Hence $[\log F(s)]'' \ge 0$, or $\log F(s)$ is convex.
5.4.12 With $u = e^{-sx}$ and $dv = \sin x\,(dx/x)$, $du = -se^{-sx}\,dx$, and $v = F(t) = \int_0^t \sin r\,(dr/r)$. So integration by parts yields the first equation. Now, change variables $y = sx$, $s\,dx = dy$ in the integral on the right, yielding

$$\int_0^b e^{-sx}\frac{\sin x}{x}\,dx = e^{-sb}F(b) + \int_0^{sb} e^{-y}F(y/s)\,dy.$$

Let $b \to \infty$. Since $F$ is bounded,

$$\int_0^\infty e^{-sx}\frac{\sin x}{x}\,dx = \int_0^\infty e^{-y}F(y/s)\,dy.$$

Now, let $s \to 0+$, and use the dominated convergence theorem. Since $F(y/s) \to F(\infty)$, as $s \to 0+$, for all $y > 0$,

$$\lim_{s\to0+}\int_0^\infty e^{-sx}\frac{\sin x}{x}\,dx = \int_0^\infty e^{-y}F(\infty)\,dy = F(\infty) = \lim_{b\to\infty}\int_0^b \frac{\sin x}{x}\,dx.$$

But, from the text, the left side is

$$\lim_{s\to0+}\arctan\left(\frac{1}{s}\right) = \arctan(\infty) = \frac{\pi}{2}.$$


Solutions to Exercises 5.5 


5.5.1 Without loss of generality, assume that $a = \max(a,b,c)$. Then $(b/a)^n \le 1$ and $(c/a)^n \le 1$. So

$$\lim_{n\nearrow\infty}(a^n + b^n + c^n)^{1/n} = a\lim_{n\nearrow\infty}(1 + (b/a)^n + (c/a)^n)^{1/n} = a.$$

For the second part, replace $a$, $b$, and $c$ in the first part by $e^a$, $e^b$, and $e^c$. Then take the log. For the third part, given $\epsilon > 0$, for all but finitely many $n \ge 1$, we have $\log(a_n) \le (A + \epsilon)n$ or $a_n \le e^{n(A+\epsilon)}$. Similarly, $b_n \le e^{n(B+\epsilon)}$, $c_n \le e^{n(C+\epsilon)}$ for all but finitely many $n \ge 1$. Hence the upper limit of $\log(a_n + b_n + c_n)/n$ is $\le \max(A,B,C) + \epsilon$. Similarly, the lower limit of $\log(a_n + b_n + c_n)/n$ is $\ge \max(A,B,C) - \epsilon$. Since $\epsilon$ is arbitrary, the result follows.
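The first limit can be watched converge numerically. A small sketch, with the hypothetical helper name `nth_root_of_power_sum`; the largest term is factored out first, exactly as in the argument above, which also keeps the floating-point powers from overflowing.

```python
def nth_root_of_power_sum(values, n):
    """Return (sum of v**n over positive values)**(1/n)."""
    m = max(values)
    # Factor out the largest value: every ratio v/m is at most 1.
    return m * sum((v / m) ** n for v in values) ** (1.0 / n)


if __name__ == "__main__":
    values = [2.0, 3.0, 5.0]
    for n in (1, 2, 10, 50, 200):
        # The printed values decrease towards max(values) = 5.
        print(n, nth_root_of_power_sum(values, n))
```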


5.5.2 The relative error is about .083%; here is Python code:

from math import exp, log, atan2, factorial


def stirling(n):
    # Stirling's approximation sqrt(2*pi) * n**(n + 1/2) * exp(-n)
    p = atan2(1, 1) * 4
    s = (n + .5) * log(n) - n
    s = exp(s)
    s *= (2 * p)**(.5)
    return s


f = float(factorial(100))
s = stirling(100)
e = 100 * (f - s) / f

print("100! = ", f, "\n")
print("stirling = ", s, "\n")
print("percentage error = ", e, "\n")


5.5.3 In the asymptotic for $\binom{n}{k}$, replace $n$ by $2n$ and $k$ by $n$. Since $t = n/2n = 1/2$ and $H(1/2) = 0$, we get $1/\sqrt{\pi n}$.


5.5.4 Straight computation. 


5.5.5 $H'(t,p) = \log(t/p) - \log[(1-t)/(1-p)]$ equals zero when $t = p$. Since $H''(t,p) = 1/t + 1/(1-t)$, $H$ is convex. So $t = p$ is a global minimum.


5.5.6 Straight computation. 
5.5.7 Since $(q^{x^2})^n = e^{nx^2\log q}$, the limit is

$$\sup\{x^2\log q : a < x < b\} = a^2\log q,$$

by the theorem. Here $\log q < 0$.
5.5.8 Since $\Gamma(s+1) = s\Gamma(s)$,

$$\begin{aligned}
f(s+1) &= \frac{3^{3s+3}\,\Gamma(s+1)\Gamma(s+1+1/3)\Gamma(s+1+2/3)}{\Gamma(3s+3)} \\
&= \frac{3^3s(s+1/3)(s+2/3)}{(3s+2)(3s+1)3s}\,f(s) = f(s).
\end{aligned}$$

Inserting the asymptotic for $\Gamma(s+n)$ yields $2\pi\sqrt{3}$ for the limit. The general case is similar.
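The periodicity and the limiting constant can be checked numerically. In this sketch the function $f$ of the exercise is assumed to be $f(s) = 3^{3s}\Gamma(s)\Gamma(s+1/3)\Gamma(s+2/3)/\Gamma(3s)$, which is what the displayed computation of $f(s+1)$ implies; the name `triplication_ratio` is illustrative.

```python
from math import lgamma, log, exp, pi, sqrt


def triplication_ratio(s):
    """Assumed f(s) = 3^(3s) G(s) G(s+1/3) G(s+2/3) / G(3s), G = Gamma.

    Evaluated through log-gamma so that large s does not overflow.
    """
    log_value = 3 * s * log(3)
    log_value += lgamma(s) + lgamma(s + 1 / 3) + lgamma(s + 2 / 3)
    log_value -= lgamma(3 * s)
    return exp(log_value)


if __name__ == "__main__":
    for s in (0.7, 1.7, 12.7):
        # Each value should match 2*pi*sqrt(3) up to rounding.
        print(s, triplication_ratio(s), 2 * pi * sqrt(3))
```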


5.5.9 Take the log and divide by $n$ to get

$$\frac{1}{n}\sum_{k=0}^{n-1}\log\Gamma(s + k/n) = \frac{n-1}{2n}\log(2\pi) + \frac{1}{n}\log\Gamma(ns) - (ns - 1/2)\frac{\log n}{n}.$$

Using Stirling's approximation for $\log\Gamma(ns)$, the result follows.
5.5.10 Here

$$L_n(ny) = \int_{-\infty}^\infty e^{n(xy - f(x))}\,dx,$$

and $g(y) = \max\{xy - f(x) : x \in \mathbf{R}\}$ exists by Exercise 2.3.20 and the sup is attained at some $c$. Fix $y \in \mathbf{R}$ and select $M > 0$ such that $-(M \pm y) < g(y)$ and $M \pm y > 0$. Since $f(x)/|x| \to \infty$ as $|x| \to \infty$, we can choose $b$ such that $b > 1$, $b > c$, and $f(x) \ge Mx$ for $x \ge b$. Similarly, we can choose $a$ such that $a < -1$, $a < c$, and $f(x) \ge M(-x)$ for $x \le a$. Write $L_n(ny) = I_n^- + I_n^0 + I_n^+ = \int_{-\infty}^a + \int_a^b + \int_b^\infty$. The second theorem in §5.5 applies to $I_n^0$; hence

$$\lim_{n\to\infty}\frac{1}{n}\log(I_n^0) = \max\{xy - f(x) : a < x < b\} = g(y)$$

since the max over $\mathbf{R}$ is attained within $(a,b)$. Now

$$I_n^- \le \int_{-\infty}^a e^{n(M+y)x}\,dx = \frac{e^{n(M+y)a}}{n(M+y)},$$

so

$$\limsup_{n\to\infty}\frac{1}{n}\log(I_n^-) \le (M+y)a < -(M+y) < g(y).$$

Similarly,

$$I_n^+ \le \int_b^\infty e^{-n(M-y)x}\,dx = \frac{e^{-n(M-y)b}}{n(M-y)},$$

so

$$\limsup_{n\to\infty}\frac{1}{n}\log(I_n^+) \le -(M-y)b < -(M-y) < g(y).$$

By Exercise 5.5.1, we conclude that

$$\lim_{n\nearrow\infty}\frac{1}{n}\log L_n(ny) = \lim_{n\nearrow\infty}\frac{1}{n}\log\left(I_n^- + I_n^0 + I_n^+\right) = g(y).$$


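5.5.11 The log of the duplication formula is

$$2s\log2 + \log\Gamma(s) + \log\Gamma(s + 1/2) - \log\Gamma(2s) = \log(2\sqrt{\pi}).$$

Differentiating,

$$2\log2 + \frac{\Gamma'(s)}{\Gamma(s)} + \frac{\Gamma'(s + 1/2)}{\Gamma(s + 1/2)} - \frac{2\Gamma'(2s)}{\Gamma(2s)} = 0.$$

Inserting $s = 1/2$, we obtain the result.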
5.5.12 Insert $s = 1/4$ in the duplication formula to get

$$\frac{\Gamma(1/4)\Gamma(3/4)}{\Gamma(1/2)} = \sqrt{2\pi}.$$

Now recall that $\Gamma(1/2) = \sqrt{\pi}$. To obtain the formula for $1/M(1, 1/\sqrt{2})$, replace $\Gamma(3/4)$ in the formula in the text by $\pi\sqrt{2}/\Gamma(1/4)$.
Solutions to Exercises 5.6 


5.6.1 Using (5.6.9), replacing $x$ by $x/2\pi$, and setting $A_{2n} = \zeta(2n)/(2\pi)^{2n}$, we have

$$\frac{x}{1 - e^{-x}} = 1 + \frac{x}{2} + 2A_2x^2 - 2A_4x^4 + 2A_6x^6 - \cdots;$$

thus

$$x = \left(x - \frac{x^2}{2!} + \frac{x^3}{3!} - \frac{x^4}{4!} + \cdots\right)\left(1 + \frac{x}{2} + 2A_2x^2 - 2A_4x^4 + 2A_6x^6 - \cdots\right).$$

Multiplying out yields $A_2 = 1/24$, $A_4 = 1/1440$, $A_6 = 1/60480$, $A_8 = 1/2419200$; hence $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$, $\zeta(6) = \pi^6/945$, and $\zeta(8) = \pi^8/9450$.
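The "multiplying out" step can be delegated to exact rational arithmetic. The sketch below solves for the coefficients $c_j$ of $x/(1 - e^{-x}) = \sum c_jx^j$ from the product identity and then reads off $A_{2n} = (-1)^{n-1}c_{2n}/2$; this sign convention is an assumption matching the alternating series $1 + x/2 + 2A_2x^2 - 2A_4x^4 + \cdots$, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def series_coefficients(order):
    """Coefficients c_0..c_order of x / (1 - exp(-x)).

    They are determined by (x - x^2/2! + x^3/3! - ...) * sum(c_j x^j) = x:
    the coefficient of x^(m+1) in the product must vanish for m >= 1.
    """
    c = [Fraction(1)]
    for m in range(1, order + 1):
        acc = sum(
            Fraction((-1) ** (k - 1), factorial(k)) * c[m + 1 - k]
            for k in range(2, m + 2)
        )
        c.append(-acc)
    return c


def a_coefficients(count):
    """Return [A_2, A_4, ..., A_(2*count)] as exact fractions."""
    c = series_coefficients(2 * count)
    return [(-1) ** (n - 1) * c[2 * n] / 2 for n in range(1, count + 1)]


if __name__ == "__main__":
    for n, a in enumerate(a_coefficients(4), start=1):
        # zeta(2n) / pi^(2n) = A_2n * 2^(2n)
        print(2 * n, a, a * 4 ** n)
```

The last printed column gives the rational factor multiplying $\pi^{2n}$ in $\zeta(2n)$.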


5.6.2 Let $b_k = B_k/k!$, and suppose that $|b_k| \le 2^k$ for $k \le n - 2$. Then (5.6.8) reads

$$\sum_{k=0}^{n-1}\frac{b_k(-1)^{n-1-k}}{(n-k)!} = 0,$$

which implies ($n! \ge 2^{n-1}$)

$$|b_{n-1}| \le \sum_{k=0}^{n-2}\frac{|b_k|}{(n-k)!} \le \sum_{k=0}^{n-2}\frac{2^k}{(n-k)!} \le \sum_{k=0}^{n-2}\frac{2^k}{2^{n-k-1}} \le 2^{n-1}.$$

Thus $|b_n| \le 2^n$ for all $n \ge 1$ by induction. Hence the radius of convergence by the root test is at least $1/2$. Also from the formula for $\zeta(2n) > 0$, the Bernoulli numbers are alternating.


5.6.3 The left inequality follows from (5.6.11) since $b_1 = \sum_{n=1}^\infty a_n$. For the right inequality, use $1 + a_n \le e^{a_n}$. So

$$\prod_{n=1}^\infty (1 + a_n) \le \prod_{n=1}^\infty e^{a_n} = \exp\left(\sum_{n=1}^\infty a_n\right).$$
5.6.4 From (5.1.3),

$$\begin{aligned}
\Gamma(x) &= \lim_{n\nearrow\infty}\frac{n^xn!}{x(1+x)(2+x)\cdots(n+x)} \\
&= \frac{1}{x}\lim_{n\nearrow\infty}\frac{e^{x(\log n - 1 - 1/2 - \cdots - 1/n)}e^xe^{x/2}\cdots e^{x/n}}{(1+x)(1+x/2)\cdots(1+x/n)} \\
&= \frac{e^{-\gamma x}}{x}\prod_{n=1}^\infty \frac{e^{x/n}}{1 + \frac{x}{n}}.
\end{aligned}$$

For the second product, use (5.1.4) instead.


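5.6.5 For $0 < x < 1$ use (5.1.3) with $x$ and $1 - x$ replacing $x$. Then $\Gamma(x)\Gamma(1 - x)$ equals

$$\begin{aligned}
&\lim_{n\nearrow\infty}\frac{n^xn^{1-x}(n!)^2}{x(1+x)\cdots(n+x)\cdot(1-x)(2-x)\cdots(n+1-x)} \\
&\quad= \frac{1}{x}\lim_{n\nearrow\infty}\frac{n(n!)^2}{(1 - x^2)(4 - x^2)\cdots(n^2 - x^2)(n + 1 - x)} \\
&\quad= \frac{1}{x}\lim_{n\nearrow\infty}\frac{1}{(1 - x^2)(1 - x^2/4)\cdots(1 - x^2/n^2)(1 + (1-x)/n)} \\
&\quad= \frac{1}{x}\left[\prod_{n=1}^\infty\left(1 - \frac{x^2}{n^2}\right)\right]^{-1} = \frac{\pi}{\sin(\pi x)}.
\end{aligned}$$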


5.6.6 The series $B(x)$ is the alternating version of the Bernoulli series (5.6.7), and the Taylor series for $\sin(x/2)$ is the alternating version of the Taylor series for $\sinh(x/2)$. But the Bernoulli series times $\sinh(x/2)$ equals $(x/2)\cosh(x/2)$. Hence (Exercise 1.7.7),

$$B(x)\sin(x/2) = (x/2)\cos(x/2).$$

Dividing by $\sin(x/2)$, we obtain the series for $(x/2)\cot(x/2)$.


5.6.7 If 6 > 1, then $B(x)$ would converge at $2\pi$. But $(x/2)\cot(x/2) = B(x)$ is infinite at $2\pi$.


5.6.8 Taking the log of (5.6.14),

$$\log[\sin(\pi x)] - \log(\pi x) = \sum_{n=1}^\infty \log\left(1 - \frac{x^2}{n^2}\right).$$

Differentiating under the summation sign,

$$\pi\cot(\pi x) - \frac{1}{x} = \sum_{n=1}^\infty \frac{2x}{x^2 - n^2}.$$

To justify this, let $f_n(x) = \log(1 - x^2/n^2)$ and let $|x| < b < 1$. Then $-\log(1 - t) = t + t^2/2 + t^3/3 + \cdots \le t + t^2 + t^3 + \cdots = t/(1 - t)$. So

$$|f_n(x)| + |f_n'(x)| \le \frac{x^2}{n^2 - x^2} + \frac{2|x|}{n^2 - x^2} \le \frac{b^2 + 2b}{n^2 - b^2} = g_n,$$

which is summable. Since this is true for all $b < 1$, the equality is valid for $|x| < 1$.


5.6.9 By Exercise 3.6.13, $\cot x - 2\cot(2x) = \tan x$. Then the series for $\tan x$ follows from the series for $\cot x$ in Exercise 5.6.6 applied to $\cot x$ and $2\cot(2x)$.


5.6.10 By Exercise 5.6.4,

$$\log\Gamma(x) = -\gamma x - \log x + \sum_{n=1}^\infty\left[\frac{x}{n} - \log\left(1 + \frac{x}{n}\right)\right].$$

Differentiating, we obtain the result. To justify this, let $f_n(x) = x/n - \log(1 + x/n)$, $f_n'(x) = 1/n - 1/(x + n)$. Then $t - \log(1 + t) = t^2/2 - t^3/3 + \cdots \le t^2/2$ for $t > 0$. Hence $f_n(x) \le x^2/n^2$. So

$$|f_n(x)| + |f_n'(x)| \le \frac{b^2 + b}{n^2} = g_n,$$

which is summable, when $0 < x < b$. Since $b$ is arbitrary, the result is valid for $x > 0$. For the second identity, use the infinite product for $x!$ instead.


5.6.11 From the previous exercise,

$$\frac{d}{dx}\log(x!) = -\gamma + \sum_{n=1}^\infty\left(\frac{1}{n} - \frac{1}{x+n}\right) = -\gamma + \sum_{n=1}^\infty \frac{x}{n(n+x)}.$$

Differentiating $r$ times, we obtain the identity. To justify this differentiation under the summation sign, let $h_n(x) = x/(n(n + x))$ and consider $-1 + \epsilon \le x \le 1/\epsilon$ for $\epsilon$ small. Then

$$|h_n(x)| + |h_n'(x)| + \cdots + |h_n^{(r)}(x)| \le \frac{|x|}{n(n+x)} + \frac{1}{(n+x)^2} + \cdots + \frac{r!}{(n+x)^{r+1}}.$$

When $n = 1$, this is no larger than $(r+1)!/\epsilon^{r+1}$. When $n \ge 2$, this is no larger than $(r+1)!/\epsilon(n-1)^2$. Since this is the general term of a convergent series, the identity is justified on $[-1 + \epsilon, 1/\epsilon]$. Since $\epsilon > 0$ is arbitrarily small, the identity follows. The inequality follows by separating out the first term.


5.6.12 Inserting x = 0 in the previous exercise yields 
This yields the coefficients of the Taylor series. To show the series converges to the function, note the integral form of the remainder in Taylor's theorem (§4.4) and use the inequality in the previous exercise. You get (here $-1 < x \le 1$)


| gt tl 


aes eC) 


=a </ oper +60) (1 —s)"ds. 


sx +1)rtt 


But for $1 \ge x > -1$ and $0 < s < 1$, $1 - s < 1 + sx$, so $(1 - s)/(1 + sx) < 1$, hence $(1 - s)^r/(1 + sx)^r \to 0$ as $r \to \infty$. Thus the integral goes to zero as $r \to \infty$ by the dominated convergence theorem. For the second series, plug $x = 1$ into the first.


5.6.13 Inserting $x = 1$ in the series for $\Gamma'(x)/\Gamma(x)$ yields a telescoping series. So we get $\Gamma'(1) = -\gamma$. Inserting $x = 2$ yields

$$\Gamma'(2) = -\frac{1}{2} - \gamma + 1 - \frac{1}{3} + \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{5} + \cdots = 1 - \gamma.$$

Since $\Gamma$ is strictly convex, this forces the min to lie in $(1,2)$.
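The two derivative values can be confirmed with a central difference quotient. This is a rough numerical sketch only: the step size `h` and the hard-coded decimal value of Euler's constant are assumptions made for the illustration.

```python
from math import gamma

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded decimal value


def gamma_derivative(x, h=1e-5):
    """Central-difference approximation to the derivative of Gamma at x."""
    return (gamma(x + h) - gamma(x - h)) / (2 * h)


if __name__ == "__main__":
    # Negative slope at 1 and positive slope at 2: the minimum lies between.
    print(gamma_derivative(1.0), -EULER_GAMMA)
    print(gamma_derivative(2.0), 1 - EULER_GAMMA)
```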


5.6.14 Differentiate the series in Exercise 5.6.10, and compare with the series in Exercise 5.1.13. Here on $0 < x < b$, we may take $g_n = (b + 1)/n^2$, $n \ge 1$.


5.6.15 Substituting $(1 - x)/2 \mapsto x$, we see that the stated equality is equivalent to

$$\lim_{x\to0+}\left[\frac{\Gamma'(x)}{\Gamma(x)} + \frac{1}{x}\right] = -\gamma.$$

Move the $1/x$ to the left in Exercise 5.6.10, and then take the limit $x \to 0+$. Under this limit, the series collapses to zero by the dominated convergence theorem for series (here, $g_n = b/n^2$ for $0 < x < b$).


Solutions to Exercises 5.7 


5.7.1 $\theta_0$ is strictly increasing since it is the sum of strictly increasing monomials. Since $\theta_0$ is continuous and $\theta_0(0+) = 1$, $\theta_0(1-) = \infty$, $\theta_0((0,1)) = (1,\infty)$. Similarly, for $\theta_+$.


5.7.2 Multiply (5.7.21) by $s\theta_0(s)^2 = \theta_0^2(1/s)$. You get (5.7.22).


5.7.3 The AGM of $\theta_0^2(q)$ and $\theta_-^2(q)$ equals 1: $M(\theta_0^2(q), \theta_-^2(q)) = 1$ for $0 < q < 1$. Since $\theta_0$ is strictly increasing, $\theta_0^2$ is strictly increasing. This forces $\theta_-^2$ to be strictly decreasing. Moreover, $\theta_-(0+) = 1$. Hence $\theta_-^2(0+) = 1$, and $q \to 1-$ implies $M(\infty, \theta_-^2(1-)) = 1$ or $\theta_-^2(1-) = 0$. Thus $\theta_-^2$ maps $(0,1)$ onto $(0,1)$. Since $\theta_-$ is continuous and $\theta_-(0+) = 1$, we also have $\theta_-$ strictly decreasing and $\theta_-((0,1)) = (0,1)$.
5.7.4 Since $\sigma(6) = \sigma(7) = 0$ and $\sigma(2n) = \sigma(n)$, $\sigma(12) = \sigma(14) = 0$. Also $\sigma(13) = 8$ since

$$13 = (\pm2)^2 + (\pm3)^2 = (\pm3)^2 + (\pm2)^2.$$

To show that $\sigma(4n - 1) = 0$, we show that $4n - 1 = i^2 + j^2$ cannot happen. Note that $i^2 + j^2$ is odd only when exactly one of $i$ or $j$ is odd and the other is even. Say $i = 2k$ and $j = 2\ell + 1$. Then

$$4n - 1 = 4k^2 + 4\ell^2 + 4\ell + 1 = 4(k^2 + \ell^2 + \ell) + 1,$$

an impossibility. Hence $\sigma(4n - 1) = 0$, so $\sigma(11) = \sigma(15) = 0$.
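These values are small enough to confirm by brute force. In the sketch below, `sigma(n)` is taken to be the number of ordered integer pairs $(i, j)$, signs included, with $i^2 + j^2 = n$; this reading is an assumption, chosen because it is what makes $\sigma(13) = 8$ in the computation above.

```python
from math import isqrt


def sigma(n):
    """Number of integer pairs (i, j) with i*i + j*j == n, for n >= 1."""
    count = 0
    r = isqrt(n)
    for i in range(-r, r + 1):
        rest = n - i * i
        j = isqrt(rest)
        if j * j == rest:
            # j and -j are distinct solutions unless j == 0.
            count += 1 if j == 0 else 2
    return count


if __name__ == "__main__":
    print([sigma(n) for n in range(1, 16)])
    print(all(sigma(4 * n - 1) == 0 for n in range(1, 200)))
    print(all(sigma(2 * n) == sigma(n) for n in range(1, 200)))
```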
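5.7.5 Let $m = M(a,b)$, $a' = a/m$, $b' = b/m$. Then $b/a = b'/a'$ and $M(a',b') = 1$, so $(a',b') = (\theta_0^2(q), \theta_-^2(q))$. Hence

$$\begin{aligned}
a - b = m(a' - b') &= m\left(\theta_0^2(q) - \theta_-^2(q)\right) = 2M(a,b)\sum_{n\ \mathrm{odd}}\sigma(n)q^n \\
&= 8M(a,b)\,q \times (1 + 2q^4 + q^8 + \cdots).
\end{aligned}$$

Replacing $q$ by $q^2$ yields the result.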


5.7.6 Let fr(t,x) = enn nt cos(nz), n > 1. Then Of, /Ot = —n?7fn, and 

0? f,/Ox? = —n* fr, n > 1. Thus to obtain the heat equation, we need only 

to justify differentiation under the summation sign. But, for t > 2a > 0, 
Ofn Ofn| . |Ofn 


fl +) Se] ++) Se| +t | eae 


= 
< An? re 2 = Yn; 


: 


which is summable since xe~** is bounded for x > 0; hence xe~ 2%” < e~®. 


Solutions to Exercises 5.8 


5.8.1 Assume that $1 < x < 2$. The integrand $f(x,t)$ is positive, thus increasing; hence $\le f(2,t)$. By Exercise 3.5.7, $f(2,t)$ is asymptotically equal to $t/2$, as $t \to 0+$, which is bounded. Also the integrand is asymptotically equal to $te^{-t}$, as $t \to \infty$, which is integrable. Since $f(2,t)$ is continuous, $g(t) = f(2,t)$ is integrable. Hence we may switch the limit with the integral.


5.8.2 The limit of the left side of (5.8.4) is $\gamma$, by (5.8.3), and $\Gamma(1) = 1$.


5.8.3 Since $\Gamma(x + n + 1) = (x + n)\Gamma(x + n) = (x + n)(x + n - 1)\cdots x\Gamma(x)$, for $x \in (-n-1, -n) \cup (-n, -n+1)$,

$$(x + n)\Gamma(x) = \frac{\Gamma(x + n + 1)}{x(x+1)\cdots(x+n-1)}.$$

Letting $x \to -n$, we get $(x + n)\Gamma(x) \to (-1)^n/n!$.
5.8.4 For $t \ge 1$,

$$\psi(t) \le \sum_{n=1}^\infty e^{-n\pi t} = \frac{e^{-\pi t}}{1 - e^{-\pi t}} \le ce^{-\pi t}$$

with $c = 1/(1 - e^{-\pi})$. Now, the integrand $f(x,t)$ in (5.8.9) is a smooth function of $x$ with $\partial^nf/\partial x^n$ continuous in $(x,t)$ for all $n \ge 0$. Moreover, for $b > 1$ and $0 < x < b$ and each $n \ge 0$,

$$\left|\frac{\partial^nf}{\partial x^n}\right| \le \psi(t)\,2^{-n}|\log t|^n\left[t^{(1-x)/2} + t^{x/2}\right] \le ce^{-\pi t}\,2^{-n+1}|\log t|^n\,t^{b/2} = g_n(t),$$

which is integrable over $(1,\infty)$. Hence we may repeatedly apply differentiation under the integral sign to conclude that the integral in (5.8.9) is smooth.


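5.8.5 Inserting $x = 2n$ in (5.8.10), we get

$$\begin{aligned}
\zeta(1 - 2n) &= \frac{\pi^{-n}\Gamma(n)\zeta(2n)}{\pi^{n-1/2}\Gamma(-n + 1/2)} \\
&= \pi^{-2n}(n-1)!\,(-n + 1/2)(-n + 3/2)\cdots(-n + n - 1/2) \\
&\qquad\times \frac{(-1)^{n-1}2^{2n-1}\pi^{2n}B_{2n}}{(2n)!} \\
&= -\frac{B_{2n}}{2n}.
\end{aligned}$$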


5.8.6 Here f(x,t) = (1+ [t]—-t)/t?*1, and I(x) = f° f(z, t) dt, x > 1. Since, 
forb>a>a>1, 


t ae 
yeni | 
and g is integrable, we may differentiate under the integral sign, obtaining 
I'(a) = (© + 1)I(a + 1) for a < & < b. Since a < BD are arbitrary, this is 
valid for z > 0. Inserting x = 2 in (5.8.6) yields 7?/6 = ¢(2) = 1 + 21(2) or 
I(2) = 72/12 — 1/2. 


5.8.7 The right side of (5.8.9) is smooth, except at $x = 0, 1$. Hence $(x - 1)\zeta(x)$ is smooth, except (possibly) at $x = 0$. By (5.8.6),

$$\log((x - 1)\zeta(x)) = \log\left(1 + x(x - 1)I(x)\right).$$

So

$$\frac{\zeta'(x)}{\zeta(x)} + \frac{1}{x - 1} = \frac{d}{dx}\log((x - 1)\zeta(x)) = \frac{(2x - 1)I(x) + x(x - 1)I'(x)}{1 + x(x - 1)I(x)}.$$

Taking the limit $x \to 1$, we approach $I(1) = \gamma$.


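5.8.8 (5.8.10) says that $\pi^{-x/2}\Gamma(x/2)\zeta(x) = \pi^{-(1-x)/2}\Gamma((1-x)/2)\zeta(1-x)$. Differentiating the log of (5.8.10) yields

$$-\frac{1}{2}\log\pi + \frac{1}{2}\frac{\Gamma'(x/2)}{\Gamma(x/2)} + \frac{\zeta'(x)}{\zeta(x)} = \frac{1}{2}\log\pi - \frac{1}{2}\frac{\Gamma'((1-x)/2)}{\Gamma((1-x)/2)} - \frac{\zeta'(1-x)}{\zeta(1-x)}.$$

Now, add $1/(x - 1)$ to both sides, and take the limit $x \to 1$. By the previous exercise, the left side becomes $-\log\pi/2 + \Gamma'(1/2)/2\Gamma(1/2) + \gamma$. By Exercise 5.6.15, the right side becomes $\log\pi/2 + \gamma/2 - \zeta'(0)/\zeta(0)$. But $\zeta(0) = -1/2$, and by Exercise 5.5.11 and Exercise 5.6.13,

$$\frac{1}{2}\frac{\Gamma'(1/2)}{\Gamma(1/2)} = \frac{1}{2}\frac{\Gamma'(1)}{\Gamma(1)} - \log2 = -\frac{\gamma}{2} - \log2.$$

So

$$-\frac{\log\pi}{2} + \left(-\frac{\gamma}{2} - \log2\right) + \gamma = \frac{\log\pi}{2} + \frac{\gamma}{2} + 2\zeta'(0).$$

Hence $\zeta'(0) = -\log(2\pi)/2$.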
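5.8.9 For $0 < a \le 1/2$,

$$-\log(1 - a) = a + \frac{a^2}{2} + \frac{a^3}{3} + \cdots \le a + a^2 + a^3 + \cdots = \frac{a}{1 - a} \le 2a.$$

On the other hand, by the triangle inequality, for $|a| \le 1/2$,

$$\begin{aligned}
|-\log(1 - a) - a| &= \left|\frac{a^2}{2} + \frac{a^3}{3} + \cdots\right| \le \frac{|a|^2}{2} + \frac{|a|^3}{3} + \cdots \\
&\le \frac{1}{2}\left(|a|^2 + |a|^3 + \cdots\right) = \frac{1}{2}\cdot\frac{a^2}{1 - |a|} \le a^2.
\end{aligned}$$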

5.8.10 $m$ and $n$ are both odd iff $mn$ is odd. So $\chi_+(m)$ and $\chi_+(n)$ are both equal to 1 iff $\chi_+(mn) = 1$. Since $\chi_+$ equals 0 or 1, this shows that $\chi_+(mn) = \chi_+(m)\chi_+(n)$. For $\chi_-$, $m$ or $n$ is even iff $mn$ is even. So

$$\chi_-(mn) = \chi_-(m)\chi_-(n) \tag{A.5.1}$$

when either $n$ or $m$ is even. If $n$ and $m$ are both odd and $m = 4i + 3$, $n = 4j + 3$, then $mn = (4i + 3)(4j + 3) = 4(4ij + 3i + 3j + 2) + 1$, which derives (A.5.1) when $\chi_-(m) = \chi_-(n) = -1$. The other three cases are similar.
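The multiplicativity of both characters can be checked exhaustively on a finite range. The definitions used in this sketch are inferred from the argument above and are therefore assumptions: `chi_plus(n)` is 1 for odd $n$ and 0 for even $n$, and `chi_minus(n)` is $1$, $-1$, or $0$ according to whether $n$ is $1 \bmod 4$, $3 \bmod 4$, or even.

```python
def chi_plus(n):
    """Assumed definition: 1 if n is odd, 0 if n is even."""
    return n % 2


def chi_minus(n):
    """Assumed definition: 1 if n = 1 mod 4, -1 if n = 3 mod 4, else 0."""
    r = n % 4
    if r == 1:
        return 1
    if r == 3:
        return -1
    return 0


def is_multiplicative(chi, limit):
    """Check chi(m*n) == chi(m)*chi(n) for all 1 <= m, n <= limit."""
    return all(
        chi(m * n) == chi(m) * chi(n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )


if __name__ == "__main__":
    print(is_multiplicative(chi_plus, 60), is_multiplicative(chi_minus, 60))
```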


5.8.11 With $f_n(x) = 1/(4n - 3)^x - 1/(4n - 1)^x$,

$$L(x, \chi_-) = \sum_{n=1}^\infty f_n(x), \qquad x > 0.$$

Then by the mean value theorem, $f_n(x) \le x/(4n - 3)^{x+1}$. Hence $|f_n(x)| \le b/(4n - 3)^{a+1} = g_n$, for $0 < a < x < b$. Since $4n - 3 \ge n$, $\sum g_n \le b\zeta(a + 1)$. So the dominated convergence theorem applies.


Solutions to Exercises 5.9 


5.9.1 Onn-1l<a<n, o(#) = Oyo f(z — 4k), so a(n) = Dy f(n — &). 
Onn <a <ntl, a) = Ort fle—k) = Tg f(e—h) + fle@—n—D), 
so q(n) = Yp_9 f(r — k) + f(—1). Since f(—1) = 0, ¢ is well defined at all 
integers. Ifn—1 <a <n, q(x) = peg f(a—2n), and g(a—1) = pg f(a@—- 
1k) = OM! fw — &). So a(x) — a — 1) = fle) — fle —n—1) = f(a) 
since f = 0 on [—2, —1]. Thus q solves (5.9.3). To show that g is smooth on 
R, assume that g is smooth on (—oo,n). Then q(x — 1) + f(z) is smooth on 
(—oo,n +1). Hence so is g(a) by (5.9.3). Thus ¢ is smooth on R. 


5.9.2 If $f(x) = 0$ for $x > 1$, the formula reads

$$q(x) = \begin{cases} -f(x+1), & x \ge -1, \\ -f(x+1) - f(x+2), & -1 \ge x \ge -2, \\ -f(x+1) - f(x+2) - f(x+3), & -2 \ge x \ge -3, \end{cases}$$

and so on.


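5.9.3 We show, by induction on $n \ge 0$, that

\[
c(D)\left[e^{ax} x^n\right] = \frac{\partial^n}{\partial a^n}\left[c(a)\, e^{ax}\right] \tag{A.5.2}
\]
for all $|a| < R$ and convergent series $c(a)$ on $(-R, R)$. Clearly, this is so for
$n = 0$. Assume that (A.5.2) is true for $n - 1$ and check (by induction over
$k$) that $D^k(e^{ax} x^n) = x D^k(e^{ax} x^{n-1}) + k D^{k-1}(e^{ax} x^{n-1})$, $k \ge 0$. Taking linear
combinations, we get $c(D)(e^{ax} x^n) = x\, c(D)(e^{ax} x^{n-1}) + c'(D)(e^{ax} x^{n-1})$. By
the inductive hypothesis applied to $c$ and $c'$, we obtain
\[
\begin{aligned}
c(D)(e^{ax} x^n) &= \frac{\partial^{n-1}}{\partial a^{n-1}}\left[x\, c(a)\, e^{ax} + c'(a)\, e^{ax}\right] \\
&= \frac{\partial^{n-1}}{\partial a^{n-1}}\, \frac{\partial}{\partial a}\left[c(a)\, e^{ax}\right] \\
&= \frac{\partial^{n}}{\partial a^{n}}\left[c(a)\, e^{ax}\right].
\end{aligned}
\]

Thus (A.5.2) is true for all $n \ge 0$.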
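When $c$ is a polynomial, identity (A.5.2) can be verified exactly. The sketch below represents $e^{ax}p(x)$ by the coefficient list of $p$, uses $D[e^{ax}p] = e^{ax}(ap + p')$ for the left side, and expands the right side by the Leibniz rule, $\partial_a^n[c(a)e^{ax}] = e^{ax}\sum_j \binom{n}{j} c^{(j)}(a)\, x^{n-j}$. All function names, and the restriction to polynomial $c$ and rational $a$, are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb


def apply_D(p, a):
    """Coefficients of q, where D[e^{ax} p(x)] = e^{ax} q(x) and q = a*p + p'."""
    q = [a * coeff for coeff in p]
    for k in range(1, len(p)):
        q[k - 1] += k * p[k]
    return q


def lhs_coefficients(c, n, a):
    """Polynomial factor of c(D)[e^{ax} x^n], with c given as [c_0, c_1, ...]."""
    term = [Fraction(0)] * n + [Fraction(1)]       # the polynomial x^n
    total = [Fraction(0)] * (n + 1)
    for c_k in c:
        total = [t + c_k * s for t, s in zip(total, term)]
        term = apply_D(term, a)                    # next power of D
    return total


def poly_derivative_at(c, j, a):
    """j-th derivative of c(a) = sum_k c_k a^k, evaluated at a."""
    value = Fraction(0)
    for k in range(j, len(c)):
        falling = 1
        for i in range(j):
            falling *= k - i
        value += c[k] * falling * a ** (k - j)
    return value


def rhs_coefficients(c, n, a):
    """Polynomial factor of d^n/da^n [c(a) e^{ax}], by the Leibniz rule."""
    coeffs = [Fraction(0)] * (n + 1)
    for j in range(n + 1):
        coeffs[n - j] = comb(n, j) * poly_derivative_at(c, j, a)
    return coeffs


if __name__ == "__main__":
    c = [Fraction(2), Fraction(-1), Fraction(3), Fraction(1, 2)]
    a = Fraction(3, 4)
    print(all(lhs_coefficients(c, n, a) == rhs_coefficients(c, n, a) for n in range(6)))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a tolerance check.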
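5.9.4 Change variable $xt = s$, $x\,dt = ds$. Then with $|f(t)| \le C$, $0 \le t \le 1$,

\[
\left| x^{n+1} \int_0^1 e^{-xt} f(t)\, t^n\, dt \right| = \left| \int_0^x e^{-s} f(s/x)\, s^n\, ds \right|
\le C \int_0^\infty e^{-s} s^n\, ds = C\,\Gamma(n+1).
\]

This shows that the integral is $O(x^{-n-1})$.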
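The estimate can be observed numerically. The sketch below uses a midpoint rule to approximate $x^{n+1}\int_0^1 e^{-xt} f(t)\,t^n\,dt$ and compares it with $C\,\Gamma(n+1)$; the choice $f = \cos$ (so $C = 1$), the step count, and the function name are assumptions of the example.

```python
import math


def scaled_integral(f, n, x, steps=20000):
    """Midpoint-rule estimate of x^(n+1) * integral_0^1 e^(-x t) f(t) t^n dt."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(-x * t) * f(t) * t ** n
    return x ** (n + 1) * total * h


if __name__ == "__main__":
    for n in (0, 1, 2):
        for x in (1.0, 5.0, 20.0, 50.0):
            # Each value should stay below Gamma(n+1) since |cos| <= 1.
            print(n, x, scaled_integral(math.cos, n, x), math.gamma(n + 1))
```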


5.9.5 Note that $\int_1^\infty e^{-xt}\,dt = e^{-x}/x$ and, by differentiation under the integral
sign,

\[
\int_1^\infty e^{-xt}\, t^p\, dt = (-1)^p \frac{d^p}{dx^p} \int_1^\infty e^{-xt}\, dt = (-1)^p \frac{d^p}{dx^p}\, \frac{e^{-x}}{x} = R(x)\, e^{-x}
\]

for some rational function $R$. But $e^{-x} = O(x^{-n})$ for all $n \ge 1$ implies
$R(x)\, e^{-x} = O(x^{-n})$ for all $n \ge 1$. Thus the integral is $\approx 0$.


5.9.6 If the Stirling series converged at some point $a$, then $B_n/(n(n-1)a^{n-1})$,
$n \ge 2$, would be bounded, say by $C$, by the nth term test. Then the Bernoulli series
would be dominated by

\[
\sum_{n} \frac{|B_n|}{n!}\,|x|^n \le C \sum_{n} \frac{n(n-1)\,|a|^{n-1}}{n!}\,|x|^n,
\]
which converges for all $x$. But we know (§5.6) that the radius of convergence
of the Bernoulli series is $2\pi$.

Solutions to Exercises 6.1


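6.1.1 If (6.1.2) holds, then the range of $f$ is finite. Conversely, if the range
of $f$ is $\{c_1, \ldots, c_N\}$, let $A_j = \{x : f(x) = c_j\}$, $j = 1, \ldots, N$.

6.1.2 Let $M \subset \mathbf{R}^2$ be measurable. Since $f(A \cap B) = f(A) \cap f(B)$ and
$f(A^c) = f(A)^c$,
\[
\begin{aligned}
\operatorname{area}(A) &= \operatorname{area}(f^{-1}(A)) \\
&= \operatorname{area}(f^{-1}(A) \cap M) + \operatorname{area}(f^{-1}(A) \cap M^c) \\
&= \operatorname{area}(f(f^{-1}(A) \cap M)) + \operatorname{area}(f(f^{-1}(A) \cap M^c)) \\
&= \operatorname{area}(A \cap f(M)) + \operatorname{area}(A \cap f(M^c)) \\
&= \operatorname{area}(A \cap f(M)) + \operatorname{area}(A \cap f(M)^c).
\end{aligned}
\]

Thus $f(M)$ is measurable.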


6.1.3 This follows from the countability of $\mathbf{Q}$ and

\[
\{x : f(x) > g(x)\} = \bigcup_{r \in \mathbf{Q}} \{x : f(x) > r > g(x)\}
= \bigcup_{r \in \mathbf{Q}} \{x : f(x) > r\} \cap \{x : r > g(x)\}.
\]


6.1.4 If $f$ is measurable, then
\[
\{x : -f(x) < M\} = \{x : f(x) > -M\} = \{x : f(x) \le -M\}^c
\]

is measurable.


6.1.5 This follows from the countability of $\mathbf{Q}$ and

\[
\{x : f(x) + g(x) < M\} = \bigcup_{\substack{r, s \in \mathbf{Q} \\ r + s < M}} \{x : f(x) < r\} \cap \{x : g(x) < s\}.
\]


6.1.6 For $f \ge 0$, $g \ge 0$, this follows from the countability of $\mathbf{Q}$ and

\[
\{x : f(x)\, g(x) < M\} = \bigcup_{\substack{r, s \in \mathbf{Q} \\ rs < M}} \{x : f(x) < r\} \cap \{x : g(x) < s\}.
\]

For general $f$, $g$, write $f = f^+ - f^-$, $g = g^+ - g^-$.


Solutions to Exercises 6.2 


6.2.1 Given $x \in U$, let $a_x = \inf\{t \in U : (t,x) \subset U\}$ and $b_x = \sup\{t \in U :
(x,t) \subset U\}$. Then $-\infty \le a_x < x < b_x \le \infty$ and $I_x = (a_x, b_x) \subset U$. If $a_x \in U$,
then there is an open interval $I \subset U$ containing $a_x$. But then $I \cup I_x \subset U$,
contradicting the definition of $a_x$. Thus $a_x \notin U$. Similarly $b_x \notin U$. Moreover
the same argument shows $I_x$, $I_{x'}$ are either disjoint or identical, for all $x$, $x'$
in $U$. Since distinct intervals $I_x$, $x \in U$, contain distinct rationals, $\{I_x : x \in U\}$ is finite or
countable.


6.2.2 Suppose $F$ is continuous, $U$ is open, and $F(c) \in U$. Given $\epsilon > 0$, there
is a $\delta > 0$ such that $|x - c| < \delta$ implies $|F(x) - F(c)| < \epsilon$. Choose $\epsilon$ small
enough so that the interval $(F(c) - \epsilon, F(c) + \epsilon)$ is in $U$. Then the interval
$(c - \delta, c + \delta)$ is in $F^{-1}(U)$. Thus $F^{-1}(U)$ is open. Conversely, the interval
$U = (F(c) - \epsilon, F(c) + \epsilon)$ is an open set, so its inverse image $F^{-1}(U)$ is an
open set containing $c$. Thus there is an open interval $I$ with $c \in I \subset F^{-1}(U)$.
Selecting $\delta > 0$ such that $(c - \delta, c + \delta) \subset I$ establishes the continuity of $F$.


6.2.3 If $I$ is an open interval, then $F(I)$ is an interval, hence measurable.
Now
\[
F\left(\bigcup_{k=1}^{\infty} I_k\right) = \bigcup_{k=1}^{\infty} F(I_k);
\]
hence $U$ open implies $F(U)$ measurable.


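6.2.4 Let $U_n = I_1 \cup \ldots \cup I_n$ with $I_k = (c_k, d_k)$. Apply (4.3.5) to $f 1_{U_n}$ and the partition
\[
-\infty < c_1 < d_1 < c_2 < d_2 < \cdots < c_n < d_n < \infty
\]

to get
\[
\int_{U_n} f(x)\, dx = \sum_{k=1}^{n} \int_{c_k}^{d_k} f(x)\, dx.
\]

Now send $n \to \infty$ and use the monotone convergence theorem.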


6.2.5 The limit of $(f_n(x))$ exists iff $(f_n(x))$ is Cauchy iff
\[
\inf_{n \ge 1}\, \sup_{j \ge n} |f_j(x) - f_n(x)| = 0.
\]

Thus $(f_n(x))$ is not Cauchy iff $x$ is in

\[
A^c = \bigcup_{m=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup_{j=n}^{\infty} \{x : |f_j(x) - f_n(x)| > 1/m\}.
\]

But countable unions, intersections, and complements of measurable
sets are measurable. Since $A$ is measurable, $f_n 1_A$, $n \ge 1$, is measurable and
converges pointwise to $f$. Thus $f$ is measurable.

Solutions to Exercises 6.3


6.3.1 $\operatorname{length}(M) = \operatorname{area}(M \times (0,1))$, so $M$ negligible in $\mathbf{R}$ implies $M \times (0,1)$
is negligible in $\mathbf{R}^2$; hence $M \times (0,1)$ is measurable in $\mathbf{R}^2$ (Exercise 4.5.17);
hence $M$ is measurable in $\mathbf{R}$.


6.3.2 $A \subset B$ implies $A \times (0,1) \subset B \times (0,1)$ and

\[
\bigcup_{n} \bigl(A_n \times (0,1)\bigr) = \left(\bigcup_{n} A_n\right) \times (0,1).
\]

Now use monotonicity and subadditivity of area.


6.3.3 If |A— B| is negligible, then so is A— B and B— A. But B is the union 
of B — A and A —(A-— B). Thus A measurable implies B measurable. 


6.3.4 The symmetric difference of $\{x : f(x) < m\}$ and $\{x : g(x) < m\}$ is
contained in $\{x : f(x) \ne g(x)\}$.
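6.3.5 Let $I$ denote the inf. If $(I_n)$ is a paving of $A$ in $\mathbf{R}$, then $(I_n \times (0,1))$ is
a paving of $A \times (0,1)$ in $\mathbf{R}^2$; hence

\[
\operatorname{length}(A) = \operatorname{area}(A \times (0,1)) \le \sum_{n=1}^{\infty} \|I_n \times (0,1)\| = \sum_{n=1}^{\infty} \|I_n\|,
\]

and thus $\operatorname{length}(A) \le I$. On the other hand, given $\epsilon > 0$, there is an open
$U \subset \mathbf{R}$ containing $A$ with

\[
\operatorname{length}(U) \le \operatorname{length}(A) + \epsilon.
\]

Write $U$ as the countable disjoint union of intervals $I_n$, $n \ge 1$, to get by
Exercise 6.2.3

\[
I \le \sum_{n=1}^{\infty} \|I_n\| = \operatorname{length}(U) \le \operatorname{length}(A) + \epsilon.
\]

The result follows.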


6.3.6 Let $\bigcup_{k=1}^{\infty} I_k$ be a paving of $A$ by intervals. Then $\operatorname{length}(A \cap I_k) \le
\alpha \operatorname{length}(I_k)$, so

\[
\operatorname{length}(A) \le \sum_{k=1}^{\infty} \operatorname{length}(A \cap I_k) \le \alpha \sum_{k=1}^{\infty} \operatorname{length}(I_k).
\]

Taking the inf over all pavings, $\operatorname{length}(A) \le \alpha \operatorname{length}(A)$. Hence
$\operatorname{length}(A) = 0$.

6.3.7 By induction, $F_n([0,1]) \subset [0,1]$, $n \ge 0$. Assume $F_n$ is increasing. Since
$0 \le x \le 1/3 \le y \le 2/3 \le z \le 1$ implies $F_{n+1}(x) \le F_{n+1}(y) \le F_{n+1}(z)$, it
is enough to show $F_{n+1}$ is increasing on each subinterval $[0, 1/3]$, $[1/3, 2/3]$,
$[2/3, 1]$. But this is clear from the definition. Thus by induction $F_n$, $n \ge 0$,
is increasing. Piecewise linearity is also clear by induction, as is $0 \le F_n'(x) \le
(3/2)^n$.


6.3.8 $e_{n+1} \le e_n/2$ is clear by induction, so for $m \ge 1$, $e_{n+m} \le 2^{-m} e_n$; hence

\[
|F_{n+m}(x) - F_n(x)| \le \sum_{k=n}^{n+m-1} |F_{k+1}(x) - F_k(x)| \le \sum_{k=n}^{\infty} 2^{-k} e_0 = 2^{-n+1} e_0.
\]

Hence $(F_n(x))$ is Cauchy. Let $m \to \infty$ in the last inequality to get
\[
|F(x) - F_n(x)| \le 2^{-n+1} e_0.
\]


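Thus $F_n \to F$ uniformly on $[0,1]$, and $F$ satisfies the recursive identity. If
$x = (d + y)/3$ with $d$ equal to 0 or 2 and $0 \le y \le 1$, then by the identity,
$F(x) = (d/2)/2 + F(y)/2$. Thus, applying this $N$ times to $x = \sum_{n=1}^{\infty} d_n/3^n$ with digits $d_n$ equal to 0 or 2,

\[
F(x) = \sum_{n=1}^{N} \frac{d_n/2}{2^n} + \frac{1}{2^N}\, F\!\left(\sum_{n=1}^{\infty} \frac{d_{N+n}}{3^n}\right).
\]

Let $N \to \infty$ to conclude $F$ is the Cantor function.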
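The halving of the errors $e_n$ can be watched on a grid. The sketch below is an illustration under stated assumptions: it takes $F_0(x) = x$ as the starting function and defines $F_{n+1}$ from $F_n$ by $F_n(3x)/2$ on $[0,1/3]$, $1/2$ on $[1/3,2/3]$, and $1/2 + F_n(3x-2)/2$ on $[2/3,1]$, which is consistent with the identity $F(x) = (d/2)/2 + F(y)/2$ used above. The names `iterate` and `sup_gap` are hypothetical.

```python
from fractions import Fraction


def iterate(n, x):
    """n-th iterate F_n(x), starting from F_0(x) = x (assumed starting function)."""
    x = Fraction(x)
    if n == 0:
        return x
    if x <= Fraction(1, 3):
        return iterate(n - 1, 3 * x) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + iterate(n - 1, 3 * x - 2) / 2
    return Fraction(1, 2)


def sup_gap(n, grid=3 ** 5):
    """Maximum of |F_{n+1} - F_n| over the grid points k/grid, 0 <= k <= grid."""
    return max(
        abs(iterate(n + 1, Fraction(k, grid)) - iterate(n, Fraction(k, grid)))
        for k in range(grid + 1)
    )


if __name__ == "__main__":
    print([sup_gap(n) for n in range(5)])
```

On this grid the gaps come out as $1/6, 1/12, 1/24, \ldots$, matching $e_{n+1} \le e_n/2$ with equality for this particular starting function.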


6.3.9 Since $F_n$ is increasing and $F_n \to F$, $F$ is increasing. Since $F([0,1]) =
[0,1]$, $F$ cannot have any jump discontinuities; hence $F$ is continuous.
We establish

\[
F_n(z) - F_n(x) \le F_n(z - x), \qquad 0 \le x \le z \le 1,\ n \ge 0, \tag{A.6.1}
\]

by induction. For $n = 0$, (A.6.1) is clear. Assume (A.6.1) is valid for $n$. To
verify (A.6.1) for $n+1$, there are six cases: (1) $x \le z \le 1/3$, (2) $2/3 \le x \le z$,
(3) $1/3 \le x \le z \le 2/3$, (4) $x \le 1/3 \le 2/3 \le z$, (5) $1/3 \le x \le 2/3 \le z$, (6)
$x \le 1/3 \le z \le 2/3$. Cases (1) and (2) are straightforward, there is nothing
to prove in case (3), case (4) depends on whether $3z - 2 \ge 3x$ or not, case (5)
is reduced to case (2) by replacing $x$ by $2/3$, and case (6) is reduced to case
(1) by replacing $z$ by $1/3$. This establishes (A.6.1). Sending $n \to \infty$ yields
$F(z) - F(x) \le F(z - x)$.
We establish
\[
F_n(x) \le x^\alpha, \qquad 0 \le x \le 1,\ n \ge 0, \tag{A.6.2}
\]

by induction. For this let $f(x) = (x - 2/3)^\alpha + (1/3)^\alpha$ and $g(x) = x^\alpha$ for
$2/3 < x < 1$. Then $f'(x) \ge g'(x)$ and $f(1) = g(1)$. By the mean value
theorem, $f(x) \le g(x)$; hence (Figure 6.1)
\[
\left(x - \frac{2}{3}\right)^{\alpha} + \left(\frac{1}{3}\right)^{\alpha} \le x^{\alpha}, \qquad \frac{2}{3} \le x \le 1. \tag{A.6.3}
\]

For $n = 0$ (A.6.2) is clear. Assume (A.6.2) is valid for $n$. Then there are three
cases in verifying the validity of (A.6.2) for $n + 1$, the last case using (A.6.3).
This establishes (A.6.2). Sending $n \to \infty$ yields $F(x) \le x^\alpha$.

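The inequality $F(x) \ge (x/2)^\alpha$ is established directly, because $F_n(x) \ge
(x/2)^\alpha$ is false for all $n \ge 0$. For this we use the inequality

\[
\sum x_n^\alpha \ge \left(\sum x_n\right)^{\alpha}, \qquad x_n \ge 0. \tag{A.6.4}
\]

Let $\Delta = \{\sum x_n = 1,\ x_n \ge 0\}$ denote the simplex of all nonnegative sequences
$(x_n)$ summing to 1 and let $h(x_1, x_2, \ldots) = \sum x_n^\alpha$. Since $(x^\alpha)'' = \alpha(\alpha -
1)x^{\alpha-2} < 0$, $x^\alpha$ is concave, so $h$ is concave. Since $h = 1$ at the vertices of $\Delta$,
we have $h \ge 1$ on $\Delta$, yielding (A.6.4). By (A.6.4), for $x = \sum d_n/3^n$ with digits $d_n$ equal to 0 or 2,

\[
F(x) = \sum_{n=1}^{\infty} \frac{d_n}{2^{n+1}} = \sum_{n=1}^{\infty} \left(\frac{d_n}{2 \cdot 3^n}\right)^{\alpha} \ge \left(\sum_{n=1}^{\infty} \frac{d_n}{2 \cdot 3^n}\right)^{\alpha} = (x/2)^\alpha.
\]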
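The two-sided bound $(x/2)^\alpha \le F(x) \le x^\alpha$ can be sampled numerically. The sketch below assumes $\alpha = \log 2/\log 3$ (the value for which $(1/3)^\alpha = 1/2$, as required by $f(1) = g(1)$ above) and approximates $F$ by applying the identity $F(x) = (d/2)/2 + F(y)/2$ a fixed number of times, starting from $F_0(y) = y$ and using $F = 1/2$ on the middle third. The names `cantor` and `bounds_hold`, the depth, and the tolerance are choices made for this illustration.

```python
import math
from fractions import Fraction


def cantor(x, depth=40):
    """Approximate the Cantor function at x in [0, 1] (this is the iterate F_depth)."""
    x = Fraction(x)
    value, scale = Fraction(0), Fraction(1)
    for _ in range(depth):
        if x <= Fraction(1, 3):          # digit d = 0: F(x) = F(3x)/2
            x = 3 * x
            scale /= 2
        elif x >= Fraction(2, 3):        # digit d = 2: F(x) = 1/2 + F(3x - 2)/2
            value += scale / 2
            x = 3 * x - 2
            scale /= 2
        else:                            # middle third: F is constant there
            return value + scale / 2
    return value + scale * x             # F_0(y) = y at the last level


def bounds_hold(grid=200, tol=1e-9):
    """Check (x/2)^alpha <= F(x) <= x^alpha at x = k/grid, up to a tolerance."""
    alpha = math.log(2) / math.log(3)
    for k in range(grid + 1):
        x = k / grid
        v = float(cantor(Fraction(k, grid)))
        if not ((x / 2) ** alpha - tol <= v <= x ** alpha + tol):
            return False
    return True


if __name__ == "__main__":
    print(cantor(Fraction(2, 9)), bounds_hold())
```

Equality is attained in the upper bound at $x = 3^{-k}$ and in the lower bound at $x = 2/3^{k}$, which is why the comparison allows a small tolerance.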


Solutions to Exercises 6.4 


6.4.1 Let c ¢ We. If tn > ct, then c < z, < 2n, so G(an) = G(z,) 9 
G(c) = G(c). Similarly if 2, > c—. Thus G is continuous on W6. Since G 
is constant on each component interval, G is continuous on Wg. Thus G is 
continuous on [a,b]. _ , 

Suppose c < x with G(c) 4 G(x). Then €< z. 

If a < c, then a < @ ¢ We, so G(c) = G(®@ > G(z) = G(a). If c= ais 
the left endpoint of a component interval, let d be the corresponding right 
endpoint. Then d < x and a < d ¢ Wa, so G(c) = G(c) = G(d) > G(x) = 
G(x). If c = a is not the left endpoint of a component interval, there exists 
C< Gy <2, Cn € Woe, Gy > ©. Then cn < 2 and a < ty ¢ We so G(c) = 
G(c) = limy_so0 G(en) > G(x) = G(x). Thus G is decreasing. 


6.4.2 $c \in \{f^* > \lambda\}$ iff there exists $n \ge 1$ with $|c| < n$ and there exists $x > c$
with $|x| < n$ satisfying $G_n(x) > G_n(c)$, which happens iff $c \in U_{G_n}$. By the
sunrise lemma, if $(c,d)$ is one of the component intervals of $U_{G_n} \subset (-n,n)$,
$G_n(d) \ge G_n(c)$ or
\[
\lambda (d - c) \le \int_c^d f(x)\, dx.
\]

Summing over all component intervals,
\[
\lambda \operatorname{length}(U_{G_n}) \le \int_{U_{G_n}} f(x)\, dx.
\]

But $U_{G_n} \subset \mathbf{R}$, $n \ge 1$, is increasing. Now use the method of exhaustion and
the monotone convergence theorem.


6.4.3 Apply the second part of the sunrise lemma to $G(x) = F(x) - \lambda x$ on
$[a, b]$ with $\lambda > \lambda_0$, and set $W_\lambda = W_G$. Then we have $G(x) \le G(c)$ for $x \ge c$
and $a < c \notin W_\lambda$, which implies $D_c F(x) \le \lambda$ for $x > c$ and $a < c \notin W_\lambda$.
Moreover $D_c F(d) = \lambda$ for each component interval $(c,d)$ of $W_\lambda$. Since $F$ is
increasing, we have $0 \le D_c F(x) \le \lambda$ for $x > c$ and $a < c \notin W_\lambda$. But this
implies $0 \le F'(c) \le \lambda$ at every Lebesgue point of $F'$; hence the result follows.


6.4.4 Let $(c,d)$ be a component interval of $W_G$ with $a < c$. Then $G(c) = G(d)$
and $G(c) \ge G(x)$ for $x \ge c$. If $c < e \in W_G$ and $G(c) = G(e)$, then there is
$x > e > c$ with $G(x) > G(e) = G(c)$, implying $c \in W_G$. But this contradicts
$c \notin W_G$.


Solutions to Exercises 6.5 


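6.5.1 If $U' \subset U$, then each component interval of $U'$ lies in $U_n$ for exactly
one $n \ge 1$. Hence

\[
\operatorname{var}(F, U') = \sum_{n=1}^{\infty} \operatorname{var}(F, U' \cap U_n).
\]

Since $\operatorname{var}(F, U' \cap U_n) \le v_F(U_n)$ and $U' \subset U$ is arbitrary, we have

\[
v_F(U) \le \sum_{n=1}^{\infty} v_F(U_n).
\]

Conversely, given $\epsilon > 0$, for each $n \ge 1$, select $U_n' \subset U_n$ satisfying
$\operatorname{var}(F, U_n') \ge v_F(U_n) - \epsilon 2^{-n}$. Let $U'$ be the union of $U_n'$, $n \ge 1$. Then
$U' \cap U_n = U_n'$, $n \ge 1$, so

\[
v_F(U) \ge \operatorname{var}(F, U') = \sum_{n=1}^{\infty} \operatorname{var}(F, U_n') \ge \sum_{n=1}^{\infty} v_F(U_n) - \epsilon.
\]

Since $\epsilon > 0$ is arbitrary, the result follows.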


6.5.2 Suppose $F$ is absolutely continuous in the sense of §6.3. Then given
$\epsilon > 0$, there exists $\delta > 0$ such that $\operatorname{var}(F, U) < \epsilon$ for any finite disjoint
union $U$ of intervals satisfying $\operatorname{length}(U) < \delta$. Suppose $U_n \subset (a, b)$ satisfies
$\operatorname{length}(U_n) \to 0$ and select $N \ge 1$ such that $\operatorname{length}(U_n) < \delta$ for $n \ge N$.
For each $n \ge 1$, select $U_n' \subset U_n$ satisfying $v_F(U_n) \le \operatorname{var}(F, U_n') + \epsilon$. For each
$n \ge 1$, order the component intervals of $U_n'$, and let $U_n^k$ denote the union of the
first $k$ of these intervals. (If $U_n'$ has only finitely many component intervals,
then $U_n^k$ doesn't depend on $k$ for $k$ large.) Now $U_n^k$ is a finite disjoint union of
intervals and $\operatorname{length}(U_n^k) < \delta$; hence $\operatorname{var}(F, U_n^k) < \epsilon$ for $k \ge 1$, $n \ge N$. Letting
$k \to \infty$ yields $\operatorname{var}(F, U_n') \le \epsilon$, $n \ge N$; hence

\[
v_F(U_n) \le \operatorname{var}(F, U_n') + \epsilon \le 2\epsilon, \qquad n \ge N.
\]

This establishes $v_F(U_n) \to 0$; thus $F$ is absolutely continuous in the sense of
§6.5. Conversely, suppose $F$ is not absolutely continuous in the sense of §6.3.
Then there exists $\epsilon > 0$ and finite disjoint unions $U_n$ satisfying $\operatorname{length}(U_n) <
1/n$ and $v_F(U_n) \ge \operatorname{var}(F, U_n) \ge \epsilon$. Hence $F$ is not absolutely continuous in
the sense of §6.5.


6.5.3 First we note $\operatorname{var}(F, (c,d)) = |F(d) - F(c)|$. Let $a = x_0 < x_1 < \cdots <
x_n = b$ be a partition of $[a, b]$. If $U = (c,d)$, by the triangle inequality

\[
\operatorname{var}(F, U) \le \sum_{k=1}^{n} \operatorname{var}(F, U \cap (x_{k-1}, x_k)).
\]

Applying this inequality to each component interval of an open $U \subset (a,b)$
and summing over these intervals yields the same inequality but now for open
$U \subset (a,b)$. Thus

\[
\operatorname{var}(F, U) \le \sum_{k=1}^{n} v_F(x_{k-1}, x_k).
\]

Taking the sup over all $U \subset (a,b)$ yields

\[
v_F(a,b) \le \sum_{k=1}^{n} v_F(x_{k-1}, x_k).
\]

On the other hand, $\bigcup_{k=1}^{n} (x_{k-1}, x_k) \subset (a,b)$. Since $(x_{k-1}, x_k)$, $k \ge 1$, are
disjoint, by Exercise 6.5.1,

\[
v_F(a,b) \ge v_F\left(\bigcup_{k=1}^{n} (x_{k-1}, x_k)\right) = \sum_{k=1}^{n} v_F(x_{k-1}, x_k).
\]

Thus $v_F(a,b) = v_F(x_0, x_1) + \cdots + v_F(x_{n-1}, x_n)$. Since $F$ is absolutely continuous,
there is an $m \ge 1$ such that $v_F(U) \le 1$ when $\operatorname{length}(U) \le 1/m$. Select
the partition to have mesh $< 1/m$. Then $v_F(x_{k-1}, x_k) \le 1$, $k = 1, \ldots, n$;
hence $v_F(a,b) \le n$.


6.5.4 If $U$ is open, since $v_F$ is increasing, by Exercise 6.5.1 applied twice,
Exercise 6.5.3, and summing over the component intervals of $U$,

\[
v_{v_F}(U) = \sum_{(c,d)} \bigl(v_F(a,d) - v_F(a,c)\bigr) = \sum_{(c,d)} v_F(c,d) = v_F(U).
\]

6.5.5 Since $F$ is Lipschitz, $\operatorname{var}(F, (c,d)) = |F(d) - F(c)| \le M(d - c)$. Applying
this to the component intervals of an open $U$ and summing over these intervals
yields $\operatorname{var}(F, U) \le M \operatorname{length}(U)$. Thus $v_F(U) \le M \operatorname{length}(U)$. This implies
absolute continuity.
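For a finite disjoint union of intervals the Lipschitz estimate is easy to test. The sketch below assumes that $\operatorname{var}(F, U)$ is the sum of $|F(d) - F(c)|$ over the component intervals $(c,d)$ of $U$, as in the first line of the solution, and uses $F = \sin$ with Lipschitz constant $M = 1$ as an example; the function names and the sample intervals are illustrative.

```python
import math


def variation(F, intervals):
    """var(F, U) for U a finite disjoint union of intervals (c, d)."""
    return sum(abs(F(d) - F(c)) for c, d in intervals)


def length(intervals):
    """Total length of a finite disjoint union of intervals."""
    return sum(d - c for c, d in intervals)


if __name__ == "__main__":
    U = [(0.0, 0.4), (1.0, 2.5), (3.0, 3.1)]
    M = 1.0  # sin is Lipschitz with constant 1
    print(variation(math.sin, U), M * length(U))
```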

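6.5.6 If we call the set in Theorem 6.5.4 $U_\lambda^+$ and the set in Theorem 6.5.5
$V_\lambda^-$, the other two sets are

\[
\begin{aligned}
U_\lambda^- &= \{c \in (a,b) : D_c F(x) > \lambda \text{ for some } x < c\}, \\
V_\lambda^+ &= \{c \in (a,b) : D_c F(x) < \lambda \text{ for some } x > c\},
\end{aligned}
\]

and the estimates they satisfy are the same as those of $U_\lambda^+$ and $V_\lambda^-$, respectively.
The proof follows by applying the results for $U_\lambda^+$ and $V_\lambda^-$ to
$\tilde F(x) = -F(-x)$.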

6.5.7 Let $0 \le c < x \le 1$. Then $c \notin C \setminus Z$ iff there is $\delta > 0$ with $(c, c + \delta) \subset C^c$,
which happens iff the graph of $F$ on $(c, 1)$ is below the line passing through
$(c, F(c))$ with slope $\lambda_0 = (1 - F(c))/\delta$. Hence $0 \le D_c F(x) \le \lambda_0$ and $c \notin W_\lambda$.
Conversely, suppose $c \in C \setminus Z$ and let $c = \sum_n d_n/3^n$. Then for all $n \ge 1$,
there is an $N \ge n$ with $d_N = 0$. If we let $x_n = c + 2/3^N$, we obtain $x_n \in C$,
$x_n \to c$, with $2 D_c F(x_n) = (3/2)^N \ge (3/2)^n$. Hence $c \in W_\lambda$ for all $\lambda > 0$.
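The difference quotient in the converse direction can be computed exactly for points with finitely many nonzero ternary digits. The sketch assumes the standard digit formula for the Cantor function, $F(\sum d_n/3^n) = \sum (d_n/2)/2^n$ with $d_n \in \{0, 2\}$, and takes $D_c F(x) = (F(x) - F(c))/(x - c)$; the helper names are hypothetical.

```python
from fractions import Fraction


def point(digits):
    """The number sum d_n / 3^n for ternary digits d_1, d_2, ... in {0, 2}."""
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))


def cantor_value(digits):
    """Cantor function at point(digits): sum (d_n / 2) / 2^n."""
    return sum(Fraction(d // 2, 2 ** (i + 1)) for i, d in enumerate(digits))


def quotient(digits, N):
    """D_c F(x) for c = point(digits) and x = c + 2/3^N, where digit N of c is 0."""
    if digits[N - 1] != 0:
        raise ValueError("digit N of c must be 0")
    shifted = list(digits)
    shifted[N - 1] = 2                   # x has the same digits as c except d_N = 2
    return (cantor_value(shifted) - cantor_value(digits)) / (point(shifted) - point(digits))


if __name__ == "__main__":
    c_digits = [2, 0, 2, 2, 0, 0, 2, 0]
    for N in (2, 5, 6, 8):
        print(N, 2 * quotient(c_digits, N), Fraction(3, 2) ** N)
```

Here $F(x) - F(c) = 2^{-N}$ and $x - c = 2 \cdot 3^{-N}$, so the quotient is $\tfrac12 (3/2)^N$, which grows without bound along zero digits of $c$.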
6.5.8 We assume $F$ is bounded variation continuous on $[a, b]$ and $F$ has been
extended to all of $\mathbf{R}$ by $F(x) = F(b)$ for $x > b$ and $F(x) = F(a)$ for $x < a$.
Then $v_F(a,x) = v_F(a,b)$ for $x > b$. For $n \ge 1$, with $f_n$ as before,

\[
\begin{aligned}
\int_a^b |f_n(x)|\, dx &\le n \int_a^b v_F(x, x + 1/n)\, dx \\
&= n \left( \int_a^b v_F(a, x + 1/n)\, dx - \int_a^b v_F(a,x)\, dx \right) \\
&= n \left( \int_{a+1/n}^{b+1/n} v_F(a,x)\, dx - \int_a^b v_F(a,x)\, dx \right) \\
&= n \left( \int_b^{b+1/n} v_F(a,x)\, dx - \int_a^{a+1/n} v_F(a,x)\, dx \right) \\
&= v_F(a,b) - n \int_a^{a+1/n} v_F(a,x)\, dx \le v_F(a,b).
\end{aligned}
\]

Now apply Fatou's lemma as before to get

\[
\int_a^b |F'(x)|\, dx \le \liminf_{n \to \infty} \int_a^b |f_n(x)|\, dx \le v_F(a,b).
\]

Solutions to Exercises 6.6


6.6.1 Given an interval $[a, b]$, there are $c$ and $d$ in $[a, b]$ with $\max_{[a,b]} F = F(d)$
and $\min_{[a,b]} F = F(c)$. Then $F((a,b))$ is an interval and

\[
\operatorname{length}(F((a,b))) = \max_{[a,b]} F - \min_{[a,b]} F = F(d) - F(c) \le v_F(a,b).
\]

Applying this to each component interval $(c,d)$ of an open set $U$ and summing
over these intervals yields

\[
\operatorname{length}(F(U)) \le \sum_{(c,d)} \operatorname{length}(F((c,d))) \le \sum_{(c,d)} v_F(c,d) = v_F(U). \tag{A.6.5}
\]


6.6.2 Given $a < b$, $F$ is absolutely continuous on $[a, b]$ and $N \cap (a,b)$ is negligible.
Select open supersets $U_n \subset (a,b)$ of $N \cap (a,b)$ satisfying $\operatorname{length}(U_n) \to 0$.
Then by (A.6.5),

\[
\operatorname{length}(F(N \cap (a,b))) \le \operatorname{length}(F(U_n)) \le v_F(U_n) \to 0.
\]

Thus $F(N \cap (a,b))$ is negligible for all $a < b$; hence $F(N)$ is negligible.


6.6.3 In (6.5.8) replace $[a, b]$ by $[x, x + \delta]$ and divide by $\delta > 0$. If $x$ is a Lebesgue
point of $|F'(x)|$ and $v_F(a, \cdot)$ is differentiable at $x$, the result follows.


6.6.4 By (6.5.2) with $f = F'$ and (6.5.3), we have
\[
v_F(a,x) - v_F(a,c) = v_F(c,x) \le \int_c^x |F'(t)|\, dt, \qquad a < c < x < b.
\]

On the other hand, for $x > c$, $|F(x) - F(c)| = \operatorname{var}(F, (c,x)) \le v_F(c,x) =
v_F(a,x) - v_F(a,c)$. Hence

\[
|F(x) - F(c)| \le v_F(a,x) - v_F(a,c) \le \int_c^x |F'(t)|\, dt, \qquad a < c < x < b.
\]

Since $v_F$ is absolutely continuous, $v_F$ is differentiable almost everywhere.
Divide by $x - c$ and let $x \to c+$. The Lebesgue differentiation theorem yields

\[
\frac{d}{dc}\, v_F(a,c) = |F'(c)|, \qquad \text{almost everywhere in } (a,b).
\]

Then (6.5.4) follows from (6.5.2) with $f = v_F$.


6.6.5 Suppose not. Then there is $\epsilon > 0$ and $(A_n)$ with $\operatorname{length}(A_n) \to 0$
and $\operatorname{length}(F(A_n)) \ge \epsilon > 0$. For each $n \ge 1$, choose open $U_n \supset A_n$ with
$\operatorname{length}(U_n) \to 0$. Then by (A.6.5),
\[
v_F(U_n) \ge \operatorname{length}(F(U_n)) \ge \operatorname{length}(F(A_n)) \ge \epsilon > 0, \qquad n \ge 1.
\]

But this contradicts absolute continuity.


6.6.6 There is a sequence of open sets $M \subset U_n \subset (a,b)$ with $\operatorname{length}(U_n) \to
\operatorname{length}(M)$. If $M$ is measurable, then $\operatorname{length}(U_n \setminus M) \to 0$. For $n \ge 1$, select
open $U_n \setminus M \subset V_n \subset (a,b)$ satisfying $\operatorname{length}(V_n) \to 0$. By absolute continuity,

\[
\operatorname{length}(F(U_n) \setminus F(M)) \le \operatorname{length}(F(U_n \setminus M))
\le \operatorname{length}(F(V_n)) \le v_F(V_n) \to 0.
\]
Let $I$ denote the intersection of $F(U_n)$, $n \ge 1$. Then $\operatorname{length}(I \setminus F(M)) = 0$
and $I \supset F(M)$. Since $U_n$ is open and $F$ is continuous, by Exercise 6.2.3, $F(U_n)$
is measurable; hence $I$ is measurable. By Exercise 6.3.3, the result follows.


6.6.7 While this is an immediate consequence of Theorem 6.6.2, the goal
here is to derive the result directly from the sunrise lemma, avoiding the
fundamental lemma. Let $A = \{F' = 0\}$. Then $A \subset V_\lambda$; hence by (6.5.6) and
Exercise 6.6.1,

\[
\operatorname{length}(F(A)) \le \operatorname{length}(F(V_\lambda)) \le v_F(V_\lambda) \le \lambda (b - a)
\]
for all $\lambda > 0$; hence $F(A)$ is negligible. Also by Exercise 6.6.2, since $N = (a,b) \setminus A$
is negligible, so is $F(N)$. Hence $F([a, b])$ is negligible; that is, $F$ is constant.


6.6.8 Since $d \in J_j$, $|F(d) - F(x_{j-1})| \le M_j - m_j$. Since $c \in J_i$, $|F(x_i) - F(c)| \le
M_i - m_i$. Now include the intermediate endpoints using the triangle inequality
to get

\[
|F(d) - F(c)| \le (M_i - m_i) + \sum_{i < \ell < j} |F(x_\ell) - F(x_{\ell-1})| + (M_j - m_j).
\]

(If there are no intervals $J_\ell$ between $J_i$ and $J_j$, the sum is zero.) Hence
\[
|F(d) - F(c)| \le \sum_{i \le \ell \le j} (M_\ell - m_\ell).
\]

For each interval $(c_k, d_k)$, let $c_k \in J_i$, $i = i(k)$, $d_k \in J_j$, $j = j(k)$. Then

\[
\sum_{k=1}^{N} |F(d_k) - F(c_k)| \le \sum_{k=1}^{N}\ \sum_{i(k) \le \ell \le j(k)} (M_\ell - m_\ell).
\]

If the mesh of $a = x_0 < x_1 < \cdots < x_n = b$ is $< \delta$, then each $J_i$ contains at
most one endpoint; hence the intervals $[i(k), j(k)] \cap \mathbf{N}$, $[i(k'), j(k')] \cap \mathbf{N}$ do
not overlap for $k \ne k'$. (6.6.2) follows.

M(a,b), 215
O(g(x)), 271 
P(x), 193 
N, 9 

~, 271 

B, 244 

7, 179 

[x], 13 

N, 9 

Q, 10 

R, 4 

Z, 10 

m, 116 

w(x), 198, 213, 251, 261 
a(n), 253 
~, 219, 258 
r(x), 244 
0(s), 251 
0+(q), 0-(4), Ao(q), 252 
C(x), 258 

a| 6, 13 

e, 85 

nl, 11 

x!, 194 


A 
absolute value, 16 
absolutely continuous 
locally, 289, 298 
additivity 
of area, 184 
well-separated, 146 
affine map, 147 


formula, 216 
functional equation, 219 
algebraic order, 71 
almost 
all, 286 
everywhere, 286 
analytic, 106, 107 
angle, 118, 181 
arccos, 118 
Archimedes’ constant, 116 
arcsin, 117 
arctan, 119 
area, 137 
additivity, 146, 184 
affine invariance, 147 
dilation invariance, 138 
method of exhaustion, 182 
monotonicity, 140 
naive, 137 
of a cover, 140 
of a parallelogram, 147 
of a rectangle, 142 
of a triangle, 145 
reflection invariance, 140 
rotation invariance, 147 
subadditivity, 140 
translation invariance, 138 
arithmetic-geometric mean, 215 
asymptotic 
equality, 219, 233, 258 
expansion, 271, 272 


AGM, 215 axiom of countable choice, 3, 141 
curve, 254 axiom of finite choice, 3, 16, 52
B


Bailey—Borwein—Plouffe series, 200 


Banach, 277 
-Zaretski theorem, 309 
indicatrix, 309 


Bernoulli 
function, 244 
number, 244 


series, 244, 246 
Bessel function, 202, 213, 232 
beta function, 228 
bijection, 2 
binomial 

asymptotics, 237 

coefficient, 95, 238 

entropy, 237 

theorem, 95, 111 
bounded 

variation, 56 


C 
Calderén-Zygmund, 296 
Cantor function, 289 
Cantor set, 133, 288 
area of the, 150 
Caratheodory, 146, 277 
Cauchy order, 41 
Cauchy sequence, 36 
Cauchy—Schwarz inequality 
for integrals, 179 
chain rule, 77 
closed 
interval, 16 
rectangle, 136 
set, 49 
codomain, 2 
compact 
interval, 16, 49 
rectangle, 136 
compactness, 49, 52 
countable, 51 
sequential, 50 
complement, 2 
completeness property, 5 
component interval, 285 
composite, 13 
continued fraction 
a, 200, 213 
V2, 20 
convergents, 206 
golden mean, 47 


absolute, 297 

at the endpoints, 162 
from the left, 61 

from the right, 61 
local absolute, 289, 298 
modulus of, 60 


under the integral sign, 210, 284 
under the summation sign, 211 


uniform, 63 
uniform modulus of, 63 
convergence 
product, 240 
quadratic, 215 
sequences, 24 
series, 31 
sub, 28 
convergents, 206 
convexity, 96 
Legendre transform, 97 
maximum principle, 97 
subdifferential, 97 
cosecant, 120 
cosh, 102 
cosine 
arclength, 181 
power series, 114 
cotangent, 120 
coth, 242 
countably compact, 51 
cover 
countable, 51, 136 
finite, 51, 136 
open, 51 
critical 
point, 78 
value, 78 


D 

De Morgan’s law, 2 

decimal 
expansions, 32 
point, 33 

delta function, 167 

derivative, 73 
Dini, 299 
linearity, 74 
partial, 224 

determinant, 147 

diameter, 145 

differentiation, 73 
power series, 103 


irrational, 46 under the integral sign, 224, 284 
rational, 15 under the summation sign, 231 
continuity, 57 digamma function, 250 


Index 


digit, 13 
dilation, 7, 125, 138 
centered, 139 
Dini derivative, 299 
discontinuity 
jump, 59 
mild, 59 
removable, 59 
wild, 59 
divisible, 13 
domain, 2 
dominated convergence theorem 
for integrals, 200, 283 
for series, 210 
domination, 201 
duplication formula, 238 


dyadic, 280 
E 
element, 1 


elementary symmetric polynomial, 99 
AGM, 223 
inequalities, 99 

entire, 106-108 

entropy, 238 

error sequence, 36 

Euler 
—Maclaurin derivative, 268 
—Maclaurin formula, 271 
arithmetic progressions, 264 
constant, 179 
gamma function, 193 
product, 263 

even, 10, 80 
part, 80 

expansion 
base of, 33 
binary, 33 
decimal, 33 
hexadecimal, 33 
ternary, 33 


F 
factor, 13 
factorial 
of a natural, 11 
of a real, 194 
Fatou’s Lemma, 197 
flip, 140 
Fourier transform, 212, 232 
fractional part, 13 
function, 2 
absolutely continuous, 297 
affine, 89 


algebraic, 68 

beta, 228 
bounded, 151 
Cantor, 289 
concave, 89 
continuous, 57 
convex, 89 
decreasing, 55 
differentiable, 73 
elementary, 270 
exponential, 70 
gamma, 193 
increasing, 55 
indicator, 279 
Lipschitz, 295 
logarithmic, 70 
measurable, 278 
monotone, 55 
negative part, 152 
nonnegative, 151 
positive part, 152 
power, 67, 69 
rational, 59 
signed, 152 

simple, 279 
smooth, 88 

strictly concave, 89 
strictly convex, 89 
strictly decreasing, 55 
strictly increasing, 55 
strictly monotone, 55 
theta, 252 
transcendental, 69 


functional equation 


AGM, 219 
theta, 252 
zeta, 262 


fundamental lemma, 306 
fundamental theorem 


of arithmetic, 15 
of calculus 
first, 169, 287 
second, 170, 290 


G 


Gamma function, 193 


Gauss AGM formula, 216 


Gaussian 
function, 223 
integral, 223 
golden mean, 47 
greatest integer, 13
H f2, 12, 26
heat equation, 258 Jn, 20 
homogeneity, 215 e, LL 
hyperbolic cotangent, 242 
L 
I l’Ho6pital’s rule, 86 
image, 2 Laplace 
inductive theorem, 235 
hypothesis, 10 transform, 167, 203, 227, 231, 251 
set, 9,14 Lebesgue, 277 
step, 10 density theorem, 294 
inf, 5 differentiation theorem, 287 
is attained, 6, 62 measure 
infimum, 5 one-dimensional, 286 
infinite product, 240 two-dimensional, 137 
for I'(x), 249 point, 286 
for sine, 247 Legendre 
for sinh, 240 polynomial, 178 
injection, 2 transform, 96 
integrability length 
local, 286 of graph, 180 
of nonnegative functions, 152 of interval, 137 
of signed functions, 153 liminf, 24 
integral limit 
additivity, 157 from the left, 54 
continuity, 164 from the right, 54 
continuity at the endpoints, 162 left lower, 55 
dilation invariance, 156 left upper, 55 
integration by parts, 173 lower, 24, 53 
linearity, 173, 195, 202, 277, 281, 283, of a sequence, 24 
284 of functions, 53 
monotonicity property, 155 of monotone sequence, 22 
of nonnegative functions, 151 point, 30, 53 
of signed functions, 153 right lower, 55 
Riemann sum, 165 right upper, 55 
substitution, 174, 311 upper, 24, 53 
test, 164 limsup, 24 
translation invariance, 156 linear map, 147 
interior, 134 linearity 
intermediate value property derivative, 74 
for continuous functions, 62 integral, 173, 195, 202, 277, 281, 283, 
for derivatives, 87 284 
intersection, 1, 136 primitive, 124 
interval, 16 Lipschitz 
closed, 16 approximation, 295 
compact, 16 function, 295 
component, 285 log-convex, 198 
open, 16 lower bound, 4 
punctured, 53 lower sequence, 23 
inverse function theorem Lusin’s property, 309 
for continuous functions, 66 
for differentiable functions, 82 M 
irrational, 18 Machin’s formula, 128 


mw, 179 mapping, 2
affine, 147
bijective, 2 
composition of, 3 
injective, 2 
invertible, 3 
linear, 147 
surjective, 2 
maximal inequality, 296 
maximum, 6 


global, 78 
local, 78 
mean 


arithmetic, 214 
arithmetic-geometric, 215 
geometric, 214 
mean value theorem, 79 
generalized, 86 
measurable 
function, 278 
set in R, 278 
set in R?, 184 
mesh 
of a partition, 65 


method of exhaustion, 132, 182, 183 


minimum, 6 
global, 78 
local, 78 
monomial, 58 
monotone convergence theorem 
for integrals, 194 
for series, 196 
multiplicative, 265 


N 

negligible set, 286 

number 
algebraic, 71 
integer, 10 
natural, 9 
rational, 10 
real, 4 


transcendental, 71 


O 

odd, 10, 80 
part, 80 

open 
cover, 51 
interval, 16 
rectangle, 136 


set 
in R, 50, 285 
in R?, 183 


subcover, 51 


P 

parity, 10 

partial product, 11 

partial sum, 11, 31 

partition, 64 
mesh, 65 

paving, 136 
open, 142 

period, 117, 122 

pi, 116 

piecewise 
constant, 65, 160 
continuous, 173 
differentiable, 179 

polynomial, 58 
coefficients of a, 58 
degree of a, 58 
elementary symmetric, 99, 222 
growth, 235 
minimal, 71 

power series, 100 
convergence, 100 
differentiation, 103 
ratio test, 101 
root test, 101 

primes, 13 
arithmetic progressions, 264 

primitive, 122, 289 
existence, 122, 290 
integration by parts, 124 
linearity, 124 
substitution, 124 
uniqueness, 123, 290 

product rule, 74 


Q 


quadratic formula, 19 
quotient rule, 74 


R 
Raabe’s formula, 239 
radius of convergence, 101 
of Bernoulli series, 246 
range, 2 
ratio test, 101 
rearrangement, 40 
rectangle, 136 
bounded, 136 
closed, 136 
compact, 136 
open, 136 
reflection, 7 
relation, 2 
remainder, 109
Cauchy form, 110
integral form, 176 
Lagrange form, 110 
Riemann integrable, 167 
Riemann sum, 161 
Riesz, 277, 292 
root, 81 
n-th, 18 
of order n, 98 
square, 12 
root test, 101 


s 

sawtooth formula, 260 

secant, 120 

sequence, 20 
bounded, 24 
Cauchy, 36 
convergent, 28 
decreasing, 21 
error, 36 
finite, 20 
increasing, 21 
lower, 23 
monotone, 22 
of sets, increasing, 182 
sub-, 28 
upper, 23 

series, 31 
nth term test, 34 
absolutely convergent, 37 
alternating, 36 
alternating version, 39 
Cauchy product, 45 
comparison test, 32 
conditionally convergent, 37 
convergent, 31 
Dirichlet test, 38 
double, 41 
geometric, 31 
harmonic, 34 
Leibnitz, 128, 199 
Leibnitz test, 39 
power, 100 
signed, 36 
Stirling, 271 
summable, 43 
tail of a, 31 
telescoping, 34 

set, 1 
bounded, 5 
closed, 49, 190 
countable, 41 
disjoint, 2
empty, 1

finite, 10 

inductive, 9 

infinite, 10 

interopen, 187 

measurable, 184, 278 

negligible, 286 

of differences, 191 

open in R, 50, 285 

open in R?, 183 

product, 2 

sub-, 1 

super-, 1 
sine 

arclength, 181 

power series, 114 
sinh, 102 
Stirling 

approximation, 233 

identity, 233 

series, 272 
subconverges, 28 
subdifferential, 97 
subgraph, 151 
summation by parts, 38 
summation under the integral sign 

absolute case, 202, 284 

alternating case, 202, 284 

positive case, 195, 283 
sunrise lemma, 293 
sup, 5 

is attained, 6, 62, 72 
superlinear, 72, 95, 97 
supremum, 5 
surjection, 2 
surreal numbers, 8 


T 
tangency 
of functions, 76 
of lines, 76 
tangent, 119 
Taylor series, 106 
Taylor’s theorem, 109, 176 
theta 
AGM curve, 254 
function, 251 
functional equation, 252 
transcendental 
function, 69 
number, 71 
translation, 7, 125, 138 
triangle inequality, 17
U
uncountable, 41 
uniformly continuous, 63 
union, 1,51, 136 
unit circle, 118, 180 
unit disk, 118 
area of, 172 
upper bound, 4 
upper sequence, 23 


Vv 
variation, 56, 297 
bounded, 56, 297 


total, 56, 297 
Vieta formula, 200 
Vitali, 277, 289 


WwW 
Wallis product, 199 
well-separated, 145 


Z 

zeta 
function, 244, 258 
functional equation, 262 
series, 250